flowchart TB
A["Python Data Types"] --> B["Scalars"]
A --> C["Collections"]
B --> B1["int"]
B --> B2["float"]
B --> B3["bool"]
B --> B4["str"]
B --> B5["None"]
C --> D["Ordered"]
C --> E["Unordered"]
D --> D1["list<br/>(mutable)"]
D --> D2["tuple<br/>(immutable)"]
E --> E1["set<br/>(mutable, unique)"]
E --> E2["dict<br/>(mutable, key-value)"]
45 Data types
What This Chapter Covers
Every value in Python has a data type — the type tells Python what operations are valid and how the value is stored. By the end of this chapter you will be able to:
- Identify Python’s fundamental scalar types: int, float, bool, str and None.
- Work with Python’s collection types: list, tuple, set and dict.
- Distinguish mutable from immutable types, and ordered from unordered collections.
- Use the built-in type() function to inspect a variable’s type.
- Convert between types safely using int(), float(), str(), list() and tuple().
- Choose the right container (list, tuple, set or dict) for a given analytics task.
45.1 A Map of Python’s Data Types
The diagram below groups Python’s most common data types into scalars (single values) and collections (containers of values). Collections are further split by whether they preserve order and whether they can be modified after creation.
45.2 Fundamental Data Types in Python
45.2.1 Numeric Types
Integers (int)
Integers represent whole numbers — positive, negative or zero — with no fractional component. In modern Python, integers have no fixed size limit: they grow as large as your machine’s memory allows.
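As a small illustration of the points above (the variable names and figures are made up for this sketch), an integer can exceed the 64-bit range without any special handling, and floor division and modulo keep results as integers:

```python
# Integers are arbitrary-precision: this value is far beyond the 64-bit range.
big = 2 ** 100
print(big)         # 1267650600228229401496703205376
print(type(big))   # <class 'int'>

# Floor division (//) and modulo (%) both return integers.
units_sold = 47    # hypothetical figures
per_box = 12
full_boxes = units_sold // per_box   # 3 full boxes
left_over = units_sold % per_box     # 11 units remaining
print(full_boxes, left_over)
```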
Floating Point Numbers (float)
Floats represent numbers with a fractional part. Python’s float is a double-precision (64-bit) number, giving roughly 15–17 significant digits of precision, which is enough for almost all business analytics work.
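The sketch below, using made-up price and quantity values, shows a float calculation and one way to inspect the precision of the float type through the standard sys module. It assumes a platform with IEEE 754 doubles, which covers practically every machine you are likely to use.

```python
import sys

price = 19.99      # hypothetical unit price
quantity = 3
total = price * quantity
print(type(total))              # <class 'float'>

# A double stores 53 bits of mantissa, which is where the
# "roughly 15-17 significant digits" figure comes from.
print(sys.float_info.mant_dig)  # 53

# Round for display or reporting rather than relying on the raw value.
rounded_total = round(total, 2)
print(rounded_total)
```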
45.2.2 Boolean (bool)
- Represents two values: True or False. Booleans are integral to control flow and decision-making structures in Python.
- Internally, True is 1 and False is 0, which means booleans can be summed (e.g. sum([True, False, True]) == 2), a trick often used to count rows that match a condition in a Pandas DataFrame.
Boolean Example
Booleans in Python represent one of two values: True or False. They are typically used in conditional statements. Below is a simple example demonstrating the use of Booleans.
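A minimal sketch of such an example follows. The revenue and target figures are hypothetical; the second half shows the summing trick on a plain list rather than a DataFrame.

```python
revenue = 12_500   # hypothetical figures
target = 10_000

# A comparison produces a bool.
hit_target = revenue > target
print(hit_target, type(hit_target))   # True <class 'bool'>

# Booleans drive conditional statements.
if hit_target:
    status = "above target"
else:
    status = "below target"
print(status)

# Because True counts as 1 and False as 0, sum() counts the True values.
monthly_hits = [True, False, True, True]
months_on_target = sum(monthly_hits)
print(months_on_target)   # 3
```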
45.2.3 None Type
None is Python’s way of saying “no value”. It is its own type (NoneType) and has exactly one instance — None. You’ll meet it when:
- A function returns nothing explicitly.
- A variable is declared but not yet assigned a meaningful value.
- A dictionary lookup with .get() finds no key, or your loading code uses None to mark a missing CSV cell.
To test for None, use is None or is not None — never == — because equality can be overridden by custom classes.
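The following sketch shows None arriving from a failed dictionary lookup and being handled with an identity test. The function find_discount and the discount codes are hypothetical names invented for illustration.

```python
def find_discount(code, discounts):
    """Return the discount rate for code, or None if the code is unknown."""
    # dict.get returns None when the key is missing.
    return discounts.get(code)

discounts = {"SPRING": 0.10, "VIP": 0.20}   # hypothetical data

rate = find_discount("SUMMER", discounts)
print(rate)              # None
print(type(rate))        # <class 'NoneType'>

# Test for None with "is", not "==".
if rate is None:
    rate = 0.0
print(rate)              # 0.0
```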
45.2.4 Sequences
45.2.5 Strings (str)
A sequence of characters, enclosed in single, double, or triple quotes. Strings are immutable — once created, their contents cannot be changed (though you can build a new string from them).
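To make immutability concrete, here is a short sketch with an arbitrary example string: methods such as capitalize() return a new string, while assigning to a character position raises TypeError.

```python
name = "analytics"

# Indexing reads a character; it does not allow modification.
first = name[0]
print(first)   # a

# String methods return a new string and leave the original untouched.
title = name.capitalize()
print(title)   # Analytics
print(name)    # analytics

# Attempting to change a character in place raises TypeError.
try:
    name[0] = "A"
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```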
45.2.6 Lists
Ordered, mutable collections of items of mixed data types. Lists are the workhorse container in Python — use them whenever you need an ordered, growing sequence.
Lists are mutable: You can change the contents in the list as shown below.
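One way to see list mutability in practice is sketched here; the sales figures are made up.

```python
sales = [120, 95, 143]   # hypothetical monthly sales

sales[1] = 100       # replace an element by index
sales.append(160)    # add to the end
sales.remove(120)    # remove the first matching value
print(sales)         # [100, 143, 160]

# A list can hold items of mixed types.
mixed = [1, "two", 3.0, True]
print(len(mixed))    # 4
```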
45.2.7 Tuples
Ordered collections of items of mixed data types, like lists, but immutable.
- Immutable: Once created, the elements of a tuple cannot be changed, added, or removed.
- Ordered: Tuples maintain the order of elements as they were added.
- Indexing and Duplication: Tuples support indexing (you can access elements using their index) and can contain duplicate elements.
- Syntax: Defined by separating elements with commas, usually enclosed in parentheses (); the parentheses are optional in most contexts.
Tuples are immutable. For example, executing my_tuple[0] = 20 raises a TypeError because changing an element in a tuple is not allowed.
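A runnable sketch of tuple behaviour follows, wrapping the failing assignment in try/except so the script continues. The values in my_tuple are arbitrary.

```python
my_tuple = (10, 20, 30)

# Parentheses are optional: the commas make the tuple.
also_a_tuple = 10, 20, 30
print(my_tuple == also_a_tuple)   # True

# A one-element tuple needs a trailing comma.
single = (10,)
print(type(single))               # <class 'tuple'>

# Item assignment is not allowed on a tuple.
try:
    my_tuple[0] = 20
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```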
45.2.8 Sets
Unordered collections of unique elements. They are mutable and are useful for operations like union, intersection, and difference.
- Mutable: You can add or remove elements from a set after its creation.
- Unordered: Sets do not maintain any order of elements, and thus they don’t support indexing.
- No Duplication: Sets cannot contain duplicate elements. Adding a duplicate element will not change the set.
- Syntax: Defined by enclosing elements in curly braces {}. An empty pair of braces creates a dict, so use set() for an empty set.
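The set operations mentioned above can be sketched as follows, using two hypothetical groups of customer names.

```python
online = {"alice", "bob", "carol"}   # hypothetical customer groups
in_store = {"bob", "dave"}

either = online | in_store       # union: in at least one group
both = online & in_store         # intersection: in both groups
online_only = online - in_store  # difference: online but not in store

# Adding an element that is already present changes nothing.
online.add("bob")
print(len(online))   # 3

# set() creates an empty set; {} would create an empty dict.
empty = set()
print(type(empty), type({}))
```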
45.2.9 Dictionaries (dict)
- Key-value pairs that are mutable and indexed by key. Dictionaries are essential for data storage and retrieval operations where relationship mapping is crucial.
- Since Python 3.7, dictionaries preserve insertion order — iterating a dict gives you keys in the order they were added.
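A brief sketch of everyday dictionary use, with made-up product prices, shows lookup by key, in-place updates, and iteration in insertion order.

```python
prices = {"apple": 0.50, "banana": 0.25}   # hypothetical data

prices["cherry"] = 3.00   # add a new key-value pair
prices["apple"] = 0.55    # update an existing value

# Iterating a dict yields keys in insertion order (Python 3.7+).
keys_in_order = list(prices)
print(keys_in_order)      # ['apple', 'banana', 'cherry']

# .get() returns None instead of raising KeyError for a missing key.
missing = prices.get("durian")
print(missing)            # None

# .items() yields (key, value) pairs.
labels = [f"{name}: {price:.2f}" for name, price in prices.items()]
print(labels)
```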
45.3 Type Conversion (Casting)
Python lets you convert between compatible types using the built-in constructor functions. These are essential when reading data from CSVs (where everything arrives as strings) or when a function expects a specific type.
| Convert to | Function | Example |
|---|---|---|
| Integer | int() | int("42") → 42, int(3.9) → 3 |
| Float | float() | float("3.14") → 3.14 |
| String | str() | str(100) → "100" |
| List | list() | list("abc") → ['a', 'b', 'c'] |
| Tuple | tuple() | tuple([1, 2, 3]) → (1, 2, 3) |
| Set | set() | set([1, 1, 2, 3]) → {1, 2, 3} |
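When values arrive as strings, a conversion can fail on unexpected text. The sketch below shows one possible way to convert defensively; to_float is a hypothetical helper written for this example, not a built-in, and the raw_row values are invented.

```python
def to_float(text):
    """Hypothetical helper: return float(text), or None if text is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None

raw_row = ["42", "3.14", "n/a"]   # strings, as they might arrive from a CSV
values = [to_float(cell) for cell in raw_row]
print(values)                     # [42.0, 3.14, None]

# type() confirms what each conversion produced.
type_name = type(values[0]).__name__
print(type_name)                  # float

# int() on a float truncates toward zero; it does not round.
truncated = int(3.9)
print(truncated)                  # 3
```

Returning None for bad input is only one design choice; depending on the task you might prefer to raise the error or substitute a default value.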
45.4 Choosing the Right Container
When you have values to store, which container should you use?
| If you need… | Use |
|---|---|
| An ordered collection you will add to or modify | list |
| A fixed group of values that must not change (e.g. coordinates, DB row) | tuple |
| A collection of unique values, with fast membership tests | set |
| A mapping from labels to values (look up by name) | dict |
| A single truth value | bool |
| Counting things | int |
| Measurements with decimals | float |
Quick rule of thumb: start with a list; upgrade to a dict the moment you find yourself writing if name == "alice": ... chains; upgrade to a set when you only care about membership and uniqueness.
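The rule of thumb can be sketched with hypothetical names and regions: a dict replaces a chain of if name == ... tests, and a set answers membership and uniqueness questions about a list of visitors.

```python
# A dict replaces a chain of "if name == ..." comparisons.
region_by_name = {"alice": "North", "bob": "South"}   # hypothetical mapping

def region_for(name):
    """Look up a region by name, falling back to 'Unknown'."""
    return region_by_name.get(name, "Unknown")

print(region_for("alice"))   # North
print(region_for("zoe"))     # Unknown

# Start with a list when order and repeats matter...
visitors = ["alice", "bob", "alice", "carol"]

# ...and move to a set when only uniqueness and membership matter.
unique_visitors = set(visitors)
print(len(unique_visitors))         # 3
print("bob" in unique_visitors)     # True
```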
45.5 Common Pitfalls with Data Types
- Comparing floats with == → Floating-point math is imprecise. 0.1 + 0.2 == 0.3 is False in Python. Use math.isclose(a, b) for near-equality checks.
- Mutating a tuple → Tuples are immutable; my_tuple[0] = 5 raises TypeError. If you need to change values, use a list.
- Using a list as a dict key → Keys must be hashable, which for built-in types means immutable. Lists, sets and dicts can’t be keys; strings, numbers and tuples of hashable items can.
- int("3.14") raises an error → int() only parses whole numbers from strings. For decimals, call float("3.14") first, then cast to int if needed.
- Sets lose order → Don’t rely on the order of elements when iterating a set. If order matters, use a list.
- is vs == → is checks identity (same object in memory); == checks equality (same value). Use is None but x == 5.
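Several of these pitfalls can be reproduced in a few lines. The sketch below uses arbitrary values and catches each error so that the script runs to the end.

```python
import math

# Float equality: use math.isclose instead of ==.
exact_equal = (0.1 + 0.2 == 0.3)
close_enough = math.isclose(0.1 + 0.2, 0.3)
print(exact_equal, close_enough)   # False True

# A list cannot be a dict key; a tuple of hashable items can.
try:
    lookup = {[1, 2]: "point"}
except TypeError as exc:
    key_error_message = str(exc)
lookup = {(1, 2): "point"}

# int() rejects a decimal string; go through float() first.
try:
    whole = int("3.14")
except ValueError:
    whole = int(float("3.14"))
print(whole)                       # 3

# == compares values; "is" compares identity.
a = [1, 2]
b = [1, 2]
same_value = (a == b)
same_object = (a is b)
print(same_value, same_object)     # True False
```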
Summary
| Concept | Description |
|---|---|
| Foundations | |
| Data Type | A classification that determines what values a variable can hold and what operations are valid on it |
| Dynamic Typing | Python infers a variable's type from its value, so no explicit declaration is needed at assignment |
| type() Function | The built-in type(x) call returns the class of x, which is the standard way to inspect a variable's data type |
| Scalars | |
| Integer (int) | Whole numbers, positive or negative, with no fractional part and no fixed size limit in modern Python |
| Floating Point (float) | Decimal numbers stored in double-precision floating point, suitable for most numerical work |
| Boolean (bool) | A truth value, either True or False, that drives conditional logic and control flow |
| None | The absence of a value, represented by the singleton None; test for it with is None, not == |
| Sequences and Collections | |
| String (str) | A sequence of characters enclosed in single, double or triple quotes, immutable once created |
| List | An ordered, mutable sequence of elements of mixed types, created with square brackets |
| Tuple | An ordered, immutable sequence of elements of mixed types, created with parentheses |
| Set | An unordered collection of unique, hashable elements that supports union, intersection and difference |
| Dictionary (dict) | A mapping of keys to values, optimised for fast lookup by key; preserves insertion order since Python 3.7 |
| Properties to Remember | |
| Mutable vs Immutable | Mutable types (list, set, dict) can be changed in place; immutable types (int, float, str, tuple) cannot |
| Ordered vs Unordered | Lists and tuples preserve insertion order; sets do not expose a meaningful iteration order |
| Indexing | Lists, tuples and strings support positional indexing with x[i], counting from zero |
| Keys and Values | Dictionaries expose .keys(), .values() and .items() methods to iterate over their contents |
| Practical Tools | |
| Type Conversion | Convert between types using int(), float(), str(), list(), tuple() and set() — essential when reading CSV data |
| Choosing a Container | Rule of thumb: list for growing ordered data, tuple for fixed records, set for uniqueness, dict for lookup by name |
| Common Pitfalls | Float equality with ==, mutating tuples, using lists as dict keys, and confusing is with == are classic traps |