Data Types
Python has the following data types built-in by default, in these categories:
- Text Type: str
- Numeric Types: int, float, complex
- Sequence Types: list, tuple, range
- Mapping Type: dict
- Set Types: set, frozenset
- Boolean Type: bool
- Binary Types: bytes, bytearray, memoryview
- None Type: NoneType
You can get the data type of any object by using the type() function:
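A minimal sketch of type() applied to one value from most of the categories above. The values and the names samples and type_names are illustrative only:

```python
# Illustrative values; any object can be passed to type().
samples = ["hello", 42, 3.14, 2 + 3j, [1, 2], (1, 2), range(3),
           {"a": 1}, {1, 2}, True, b"hi", None]

# type(value) returns the object's class; __name__ is its short name.
type_names = [type(value).__name__ for value in samples]

for value, name in zip(samples, type_names):
    print(f"{value!r} -> {name}")
```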
You can copy the code here and run it within the code cell of a Jupyter notebook.
Setting the Data Type
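Python is a dynamically typed language: the type belongs to the value, not to the variable name, and it is checked while the program runs. A related idea is duck typing (if it looks like a duck and walks like a duck, then it is a duck), where code cares about what an object can do rather than what type it was declared as. In practice, this means that you do not have to declare the data type of a variable when you create one. This is in contrast to statically typed languages like C, C++ and Java, where you have to declare the data type of a variable when you create one.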
In Python, the data type is set when you assign a value to a variable:
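As a small sketch (the variable names x and y are placeholders), the same name can refer to values of different types over time, and a constructor such as float() can be used when a specific type is wanted:

```python
x = 5              # x now refers to an int
first_type = type(x)

x = "five"         # the same name can be rebound to a str
second_type = type(x)

y = float(5)       # request a specific type with its constructor

print(first_type, second_type, type(y))
```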
Lists
A list is a collection that is ordered and changeable. In Python, lists are written with square brackets. Python uses 0-based indexing, meaning the first element in the list is at index 0. This is in contrast to R, which uses 1-based indexing, so the first element of a list in R is at index 1. Demo below:
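A short sketch of list indexing and in-place changes, using a made-up fruits list:

```python
fruits = ["apple", "banana", "cherry"]

first = fruits[0]    # index 0 is the first element
last = fruits[-1]    # negative indices count from the end

fruits[1] = "blueberry"   # lists are changeable: replace an element
fruits.append("date")     # ...or add one to the end

print(first, last)
print(fruits)
```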
Tuples vs Lists
Tuples and lists look similar but have one key difference: Tuples are immutable, meaning you cannot change, add, or remove elements after the tuple is created. Tuples are written with parentheses, and lists are written with square brackets. For example:
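One way to see the difference, sketched with illustrative names: the same item assignment succeeds on a list and raises a TypeError on a tuple.

```python
point_list = [1, 2, 3]     # square brackets: a list
point_tuple = (1, 2, 3)    # parentheses: a tuple

point_list[0] = 10         # fine, lists are mutable

tuple_error = None
try:
    point_tuple[0] = 10    # tuples do not support item assignment
except TypeError as err:
    tuple_error = str(err)

print(point_list)
print(point_tuple)
print(tuple_error)
```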
Sets vs Lists
Sets are unordered and unindexed, meaning you cannot access items in a set by referring to an index. Sets are written with curly brackets. Membership checks on a set take constant time on average, regardless of its size. A list, on the other hand, has to be scanned element by element, so in the worst case the check visits the entire list. For example:
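A rough sketch with made-up data: both containers give the same answer to an `in` check, but the list gets there by scanning while the set uses a hash lookup. The sizes and names here are arbitrary.

```python
ids_list = list(range(100_000))
ids_set = set(ids_list)

target = 99_999               # last element: worst case for the list scan

in_list = target in ids_list  # walks the list until it finds a match
in_set = target in ids_set    # hash lookup, independent of position

# Sets also keep each value only once.
unique = {1, 2, 2, 3}

print(in_list, in_set, len(unique))
```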
Sets are useful when checking for overlaps between two lists - especially with large datasets.
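For instance, converting two lists to sets lets you find their overlap with the `&` operator. The customer lists below are invented for illustration:

```python
customers_2023 = ["ana", "ben", "cara", "dev"]
customers_2024 = ["cara", "dev", "eli"]

# Intersection: values present in both lists
overlap = set(customers_2023) & set(customers_2024)

# Difference: values in the first list but not the second
only_2023 = set(customers_2023) - set(customers_2024)

# Sets have no order, so sort before printing for stable output
print(sorted(overlap))
print(sorted(only_2023))
```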
Dictionaries
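Dictionaries are collections of key-value pairs that are changeable and indexed by key rather than by position. Since Python 3.7, dictionaries preserve the order in which keys were inserted; earlier versions made no ordering guarantee. In Python, dictionaries are written with curly brackets. For example: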
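A brief sketch using a hypothetical car record, showing lookup by key, updating and adding entries, and a safe lookup with get():

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

model = car["model"]        # look up a value by its key

car["year"] = 2020          # change an existing value
car["colour"] = "red"       # add a new key-value pair

owner = car.get("owner")    # get() returns None if the key is missing

print(model, owner)
print(list(car))            # keys come back in insertion order
```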
What You Really Need to Know
That was a quick tour of the data types in Python. These are the core Python data types that you will commonly use:
- int: Integer
- float: Floating Point Number
- str: String
- list: List
- dict: Dictionary
- bool: Boolean
Just as a final note, there are other data types that you will encounter when you start doing data science, for example pandas.DataFrame, pandas.Series and datetime.datetime objects, but this is where you should start.
Finally, remember Python uses 0-based indexing and R uses 1-based indexing.