Are you talking about multi-line strings? Easy, use triple quotes to start and end them.
s = """ this is a very
long string if I had the
energy to type more and more ..."""
You can use single quotes too (three of them, of course, at the start and end) and treat the resulting string s just like any other string.
NOTE: Just as with any string, anything between the starting and ending quotes becomes part of the string, so this example has a leading blank (as pointed out by @root45). This string will also contain both blanks and newlines.
That is, s evaluates to:
' this is a very\nlong string if I had the\nenergy to type more and more ...'
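To make the note above concrete, here is a small sketch (the variable name and text are made up for illustration) showing that any indentation you put on the continuation lines also ends up inside the string:

```python
# Everything between the triple quotes is kept verbatim,
# including the leading blank and the indentation of line two.
s = """ this is a very
    long string"""

print(repr(s))
```

If you indent the continuation lines to line up with your code, those spaces become part of the value.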
Finally, one can also construct long lines in Python like this:
s = ("this is a very"
"long string too"
"for sure ..."
)
which will not include any extra blanks or newlines (this is a deliberate example showing the effect of leaving out the blanks):
'this is a verylong string toofor sure ...'
No commas required; simply place the strings to be joined together inside a pair of parentheses and be sure to account for any needed blanks and newlines.
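For comparison, a minimal sketch of the same construction with the blanks written explicitly at the end of each piece (the text is just the example from above):

```python
# Adjacent string literals inside parentheses are joined at compile time.
# The trailing blank in each piece is what separates the words.
s = ("this is a very "
     "long string too "
     "for sure ...")

print(s)
```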
The column names (which are strings) cannot be sliced in the manner you tried.
Here you have a couple of options. If you know from context which columns you want to slice out, you can select only those columns by passing a list into the __getitem__ syntax (the []'s).
df1 = df[['a', 'b']]
Alternatively, if it matters to index them numerically and not by their name (say your code should automatically do this without knowing the names of the first two columns) then you can do this instead:
df1 = df.iloc[:, 0:2] # Remember that Python does not slice inclusive of the ending index.
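Here is a small self-contained sketch of both options side by side. The frame and its column names 'a', 'b', 'c' are made up for illustration:

```python
import pandas as pd

# Hypothetical frame with three columns
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

by_name = df[["a", "b"]]       # select by column label
by_position = df.iloc[:, 0:2]  # select by position; the end index 2 is excluded

print(by_name.equals(by_position))
```

Both give the first two columns here, since 'a' and 'b' happen to sit at positions 0 and 1.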
Additionally, you should familiarize yourself with the idea of a view into a Pandas object vs. a copy of that object. The first of the above methods will return a new copy in memory of the desired sub-object (the desired slices).
Sometimes, however, there are indexing conventions in Pandas that don't do this and instead give you a new variable that just refers to the same chunk of memory as the sub-object or slice in the original object. This can happen with the second way of indexing, so you can add the .copy() method to get a regular copy. When it does happen, changing what you think is the sliced object can sometimes alter the original object. It is always good to be on the lookout for this.
df1 = df.iloc[:, 0:2].copy() # To avoid the case where changing df1 also changes df
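A quick sketch of what the explicit copy buys you, using a made-up frame: after .copy(), writing to df1 leaves df alone.

```python
import pandas as pd

# Hypothetical frame for illustration
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

df1 = df.iloc[:, 0:2].copy()  # explicit, independent copy
df1.loc[0, "a"] = 100         # modify the copy only

print(df.loc[0, "a"])   # value in the original frame
print(df1.loc[0, "a"])  # value in the copy
```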
To use iloc, you need to know the column positions (or indices). As the column positions may change, instead of hard-coding indices, you can use iloc along with the get_loc method of the dataframe's columns attribute to obtain column indices.
{c: df.columns.get_loc(c) for c in df.columns}
Now you can use this dictionary to access columns by name while using iloc.
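As a rough sketch of how that might look in practice: the frame, the name col_pos, and the chosen columns are all made up for illustration, and the column names are assumed to be unique so that get_loc returns a single integer.

```python
import pandas as pd

# Hypothetical frame for illustration
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Map each column name to its current position
col_pos = {name: df.columns.get_loc(name) for name in df.columns}

# Use the looked-up positions with iloc instead of hard-coded indices
subset = df.iloc[:, [col_pos["a"], col_pos["c"]]]

print(subset)
```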
Best Solution
There are 2 approaches I propose:
If the range of values is as restricted as you say, then using isin will be the fastest method.
Otherwise, we could cast to a str and then call .isdigit().
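Since the question's data isn't shown here, the following is only a sketch on a made-up column. The column name value, the sample entries, and the allowed list are all assumptions. It shows roughly what the two approaches could look like:

```python
import pandas as pd

# Hypothetical data: a column of strings, some of them numeric
df = pd.DataFrame({"value": ["1", "x", "3", "12", "abc"]})

# Approach 1: test membership in a small, known set of values
allowed = ["1", "2", "3"]  # assumed restricted range
mask_isin = df["value"].isin(allowed)

# Approach 2: cast to str and keep entries made up only of digits
mask_digit = df["value"].astype(str).str.isdigit()

print(df[mask_isin])
print(df[mask_digit])
```

Note that the two masks differ for "12": it is all digits, but it is not in the assumed allowed list.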