45  Data types

What This Chapter Covers

Every value in Python has a data type — the type tells Python what operations are valid and how the value is stored. By the end of this chapter you will be able to:

  • Identify Python’s fundamental scalar types: int, float, bool, str and None.
  • Work with Python’s collection types: list, tuple, set and dict.
  • Distinguish mutable from immutable types, and ordered from unordered collections.
  • Use the built-in type() function to inspect a variable’s type.
  • Convert between types safely using int(), float(), str(), list() and tuple().
  • Choose the right container (list, tuple, set or dict) for a given analytics task.

45.1 A Map of Python’s Data Types

The diagram below groups Python’s most common data types into scalars (single values) and collections (containers of values). Collections are further split by whether they preserve order and whether they can be modified after creation.

flowchart TB
    A["Python Data Types"] --> B["Scalars"]
    A --> C["Collections"]
    B --> B1["int"]
    B --> B2["float"]
    B --> B3["bool"]
    B --> B4["str"]
    B --> B5["None"]
    C --> D["Ordered"]
    C --> E["Unordered"]
    D --> D1["list<br/>(mutable)"]
    D --> D2["tuple<br/>(immutable)"]
    E --> E1["set<br/>(mutable, unique)"]
    E --> E2["dict<br/>(mutable, key-value)"]
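
The types in the diagram can be inspected at runtime with the built-in type() function. The following is a minimal sketch; the sample values are made up for illustration.

```python
# One illustrative value for each type shown in the diagram (values are arbitrary).
samples = [42, 3.14, True, "text", None, [1, 2], (1, 2), {1, 2}, {"k": 1}]

for value in samples:
    # type(value) returns the class; __name__ gives its short name as a string.
    print(repr(value), "->", type(value).__name__)

type_names = [type(value).__name__ for value in samples]
```

Note that None reports its type as NoneType, which is the name of the class whose only instance is None.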


45.2 Fundamental Data Types in Python

45.2.1 Numeric Types

Integers (int)

Integers represent whole numbers — positive, negative or zero — with no fractional component. In modern Python, integers have no fixed size limit: they grow as large as your machine’s memory allows.
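
As a quick illustration of the "no fixed size limit" point, the sketch below computes an integer far beyond the 64-bit range. The variable names are illustrative only.

```python
# Python integers do not overflow; they grow as needed.
big = 2 ** 100
print(big)                    # 1267650600228229401496703205376
print(type(big).__name__)     # int

# Negative whole numbers are integers too.
count = -7
print(type(count).__name__)   # int

# Underscores may be used as visual separators in long literals.
population = 1_000_000
print(population + 1)         # 1000001
```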

Floating Point Numbers (float)

Floats represent numbers with a fractional part. Python’s float is a double-precision (64-bit) number, giving roughly 15–17 significant digits of precision, which is enough for almost all business analytics work.
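
A short sketch of how floats behave in everyday arithmetic follows. The price and quantity are hypothetical, and the precision figure reported by sys.float_info assumes the usual IEEE 754 double format.

```python
import sys

unit_price = 19.99      # hypothetical price
quantity = 3            # an int

# Mixing an int with a float produces a float.
revenue = unit_price * quantity
print(type(revenue).__name__)   # float
print(round(revenue, 2))        # 59.97

# True division always returns a float; floor division of ints returns an int.
print(7 / 2)     # 3.5
print(7 // 2)    # 3

# Number of decimal digits a float can always represent faithfully.
print(sys.float_info.dig)       # 15 on IEEE 754 platforms
```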

45.2.2 Boolean (bool)

  • Represents two values: True or False. Booleans are integral to control flow and decision-making structures in Python.
  • Internally, True is 1 and False is 0 (bool is a subclass of int), which means booleans can be summed (e.g. sum([True, False, True]) == 2), a trick often used to count rows that match a condition in a pandas DataFrame.

Boolean Example

Booleans usually arise as the result of a comparison (for example, x > 10) and are consumed by conditional statements such as if and while.
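
The sketch below shows a Boolean produced by a comparison, used in an if statement, and then summed to count matches. The order totals and threshold are made-up values.

```python
order_totals = [120.0, 45.5, 300.0, 80.0]   # hypothetical order values
threshold = 100.0

# A comparison evaluates to a bool.
is_large = order_totals[0] > threshold
print(is_large)               # True

# Booleans drive conditional statements.
if is_large:
    print("First order qualifies for review")

# Because True counts as 1 and False as 0, summing flags counts the matches.
flags = [total > threshold for total in order_totals]
large_count = sum(flags)
print(large_count)            # 2
```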

45.2.3 None Type

None is Python’s way of saying “no value”. It is its own type (NoneType) and has exactly one instance — None. You’ll meet it when:

  • A function returns nothing explicitly.
  • A variable is initialised as a placeholder before it receives a meaningful value.
  • A dictionary lookup with .get() finds no matching key, or a data source reports a missing field.

To test for None, use is None or is not None — never == — because equality can be overridden by custom classes.
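
The following sketch shows where None typically appears and why is None is the safer test. The function, dictionary and class are hypothetical and exist only for this illustration.

```python
def log_message(text):
    """Print a message; there is no return statement, so None is returned."""
    print(text)

result = log_message("loading data")
print(result is None)          # True

# dict.get() returns None when the key is absent.
prices = {"apple": 1.20}
pear_price = prices.get("pear")
if pear_price is None:
    print("No price recorded for pear")

# A hypothetical class that overrides equality to always return True.
class AlwaysEqual:
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
print(obj == None)             # True, which is misleading
print(obj is None)             # False, the correct answer
```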

45.2.4 Sequences

45.2.5 Strings (str)

A sequence of characters, enclosed in single, double, or triple quotes. Strings are immutable — once created, their contents cannot be changed (though you can build a new string from them).
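
A minimal sketch of string immutability: item assignment fails, while string methods return new strings and leave the original untouched. The sample text is arbitrary.

```python
name = "analytics"
print(name[0])        # a
print(len(name))      # 9

# Strings cannot be modified in place.
try:
    name[0] = "A"
except TypeError as exc:
    print("Cannot modify:", exc)

# String methods return a new string instead.
title = name.capitalize()
print(title)          # Analytics
print(name)           # analytics (unchanged)

# Triple quotes allow a string to span several lines.
note = """line one
line two"""
print(note.count("\n"))   # 1
```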

45.2.6 Lists

Ordered, mutable collections whose items may be of mixed data types. Lists are the workhorse container in Python — use them whenever you need an ordered, growing sequence.

Because lists are mutable, you can change, add or remove elements in place after the list has been created.
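
A short sketch of in-place list modification follows. The sales figures are invented for the example.

```python
sales = [250, 300, 125]    # hypothetical daily sales

sales[0] = 275             # replace an element by index
sales.append(410)          # add to the end
sales.remove(125)          # remove the first matching value
print(sales)               # [275, 300, 410]

# A list may hold items of different types.
mixed = [1, "two", 3.0, True]
print(len(mixed))          # 4
```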

45.2.7 Tuples

Ordered collections like lists, but immutable; their items may be of mixed data types.

  • Immutable: Once created, the elements of a tuple cannot be changed, added, or removed.
  • Ordered: Tuples maintain the order of elements as they were added.
  • Indexing and Duplication: Tuples support indexing (you can access elements using their index) and can contain duplicate elements.
  • Syntax: Defined by separating elements with commas, usually enclosed in parentheses (). The parentheses are optional in most contexts, and a single-element tuple needs a trailing comma, as in (5,).

Tuples are immutable. For an existing tuple named my_tuple, the assignment
my_tuple[0] = 20
raises a TypeError, because changing an element of a tuple is not allowed.
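
The sketch below defines a tuple named my_tuple (the values are arbitrary) and demonstrates indexing, duplicates, the optional parentheses and the error raised on assignment.

```python
my_tuple = (10, 15, 10)
print(my_tuple[0])          # 10
print(my_tuple.count(10))   # 2, duplicates are allowed

# Parentheses are optional; the commas create the tuple.
point = 3.5, 7.2
print(type(point).__name__) # tuple

# A single-element tuple needs a trailing comma.
single = (42,)
print(len(single))          # 1

# Attempting to change an element raises TypeError.
try:
    my_tuple[0] = 20
except TypeError as exc:
    print("Error:", exc)    # Error: 'tuple' object does not support item assignment

# Tuples unpack neatly into separate variables.
x, y = point
```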

45.2.8 Sets

Unordered collections of unique elements. They are mutable and are useful for operations like union, intersection, and difference.

  • Mutable: You can add or remove elements from a set after its creation.
  • Unordered: Sets do not maintain any order of elements, and thus they don’t support indexing.
  • No Duplication: Sets cannot contain duplicate elements. Adding a duplicate element will not change the set.
  • Syntax: Defined by enclosing elements in curly braces {}. An empty pair of braces creates a dict, so use set() to create an empty set.
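
The set properties listed above can be sketched as follows. The region names and numbers are made up; sorted() is used when printing only to get a predictable display order.

```python
regions = {"north", "south", "north"}   # the duplicate is dropped
print(len(regions))                     # 2

regions.add("east")      # adds a new element
regions.add("south")     # already present, so nothing changes
print(sorted(regions))   # ['east', 'north', 'south']

# Membership tests are the main use of a set.
print("east" in regions) # True

# Set algebra.
a = {1, 2, 3}
b = {3, 4}
print(sorted(a | b))     # [1, 2, 3, 4]  union
print(sorted(a & b))     # [3]           intersection
print(sorted(a - b))     # [1, 2]        difference

# {} is an empty dict; use set() for an empty set.
empty_set = set()
print(type({}).__name__, type(empty_set).__name__)   # dict set
```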

45.2.9 Dictionaries (dict)

  • Mutable collections of key-value pairs, indexed by key rather than by position. Dictionaries are essential for data storage and retrieval operations where relationship mapping is crucial.
  • Dictionaries are often described as unordered because values are looked up by key, but since Python 3.7 they preserve insertion order: iterating a dict gives you keys in the order they were added.
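
A minimal sketch of dictionary use follows: lookup by key, in-place updates, a safe lookup with a default, and iteration in insertion order. The stock data is hypothetical.

```python
stock = {"apple": 40, "pear": 15}   # hypothetical inventory

stock["kiwi"] = 60        # add a new key-value pair
stock["apple"] += 5       # update an existing value in place
print(stock["apple"])     # 45

# .get() avoids a KeyError when the key might be missing.
print(stock.get("mango", 0))   # 0

# Iteration follows insertion order (Python 3.7 and later).
for fruit, quantity in stock.items():
    print(fruit, quantity)

keys_in_order = list(stock)
print(keys_in_order)      # ['apple', 'pear', 'kiwi']
```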

45.3 Type Conversion (Casting)

Python lets you convert between compatible types using the built-in constructor functions. These are essential when reading data from CSVs (where everything arrives as strings) or when a function expects a specific type.

| Convert to | Function | Example |
|---|---|---|
| Integer | int() | int("42") → 42, int(3.9) → 3 |
| Float | float() | float("3.14") → 3.14 |
| String | str() | str(100) → "100" |
| List | list() | list("abc") → ['a', 'b', 'c'] |
| Tuple | tuple() | tuple([1, 2, 3]) → (1, 2, 3) |
| Set | set() | set([1, 1, 2, 3]) → {1, 2, 3} |
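
Conversions can fail when the input is not in the expected form, so one possible way to convert "safely" is to catch the error and fall back to None. The helper to_int_or_none below is a hypothetical name introduced for this sketch, not a built-in, and the raw values are invented.

```python
def to_int_or_none(text):
    """Hypothetical helper: return int(text), or None if it cannot be parsed."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None

raw_cells = ["42", "3.14", "n/a", ""]   # strings as they might arrive from a CSV
parsed = [to_int_or_none(cell) for cell in raw_cells]
print(parsed)                  # [42, None, None, None]

# int() truncates toward zero when given a float.
print(int(3.9), int(-3.9))     # 3 -3

# A decimal string must go through float() first.
print(int(float("3.14")))      # 3

# str() is needed before concatenating a number with text.
print(str(100) + " units")     # 100 units
```

Returning None is only one design choice; depending on the task, raising the error or substituting a default value may be more appropriate.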

45.4 Choosing the Right Container

When you have values to store, which container should you use?

| If you need… | Use |
|---|---|
| An ordered collection you will add to or modify | list |
| A fixed group of values that must not change (e.g. coordinates, DB row) | tuple |
| A collection of unique values, with fast membership tests | set |
| A mapping from labels to values (look up by name) | dict |
| A single truth value | bool |
| Counting things | int |
| Measurements with decimals | float |

Quick rule of thumb: start with a list; upgrade to a dict the moment you find yourself writing if name == "alice": ... chains; upgrade to a set when you only care about membership and uniqueness.
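
The rule of thumb can be sketched as follows. The customer names, discount rates and the function discount_for are hypothetical and serve only to show a dict replacing a chain of name comparisons, and a set replacing a list when only uniqueness matters.

```python
# Instead of: if name == "alice": ... elif name == "bob": ...
discount_by_customer = {"alice": 0.10, "bob": 0.05}   # hypothetical rates

def discount_for(name):
    """Look up a customer's discount, defaulting to 0.0 for unknown names."""
    return discount_by_customer.get(name, 0.0)

print(discount_for("alice"))    # 0.1
print(discount_for("zed"))      # 0.0

# A list keeps every visit in order; a set keeps only the distinct visitors.
visits = ["alice", "bob", "alice", "carol"]
unique_visitors = set(visits)
print(len(visits), len(unique_visitors))   # 4 3
print("carol" in unique_visitors)          # True
```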


45.5 Common Pitfalls with Data Types

  • Comparing floats with == → Binary floating point cannot represent most decimal fractions exactly, so 0.1 + 0.2 == 0.3 is False in Python. Use math.isclose(a, b) for near-equality checks.
  • Mutating a tuple → Tuples are immutable; my_tuple[0] = 5 raises TypeError. If you need to change values, use a list.
  • Using a list as a dict key → Keys must be hashable, which for built-in types means immutable. Lists, sets and dicts can’t be keys; strings, numbers and tuples of hashable items can.
  • int("3.14") raises an error → int() only parses whole numbers from strings, so this call raises ValueError. For decimals, call float("3.14") first, then cast to int if needed.
  • Sets lose order → Don’t rely on the order of elements when iterating a set. If order matters, use a list.
  • is vs == → is checks identity (same object in memory); == checks equality (same value). Use is None but x == 5.
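
Several of these pitfalls can be reproduced in a few lines. The sketch below uses made-up values; the coordinates and city name are illustrative only.

```python
import math

# Float comparison: exact equality fails, math.isclose succeeds.
print(0.1 + 0.2 == 0.3)               # False
print(math.isclose(0.1 + 0.2, 0.3))   # True

# Dict keys must be hashable: a tuple works, a list does not.
locations = {(51.5, -0.1): "London"}
try:
    locations[[48.9, 2.4]] = "Paris"
except TypeError as exc:
    print("Error:", exc)              # Error: unhashable type: 'list'

# int() cannot parse a decimal string directly.
try:
    int("3.14")
except ValueError:
    print(int(float("3.14")))         # 3

# Two lists with equal contents are equal but not the same object.
a = [1, 2]
b = [1, 2]
print(a == b)    # True
print(a is b)    # False
```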

Summary

| Concept | Description |
|---|---|
| Foundations | |
| Data Type | A classification that determines what values a variable can hold and what operations are valid on it |
| Dynamic Typing | Python infers a variable's type from its value, so no explicit declaration is needed at assignment |
| type() Function | The built-in type(x) call returns the class of x, which is the standard way to inspect a variable's data type |
| Scalars | |
| Integer (int) | Whole numbers, positive or negative, with no fractional part and no fixed size limit in modern Python |
| Floating Point (float) | Decimal numbers stored in double-precision floating point, suitable for most numerical work |
| Boolean (bool) | A truth value, either True or False, that drives conditional logic and control flow |
| None | The absence of a value, represented by the singleton None; test for it with is None, not == |
| Sequences and Collections | |
| String (str) | A sequence of characters enclosed in single, double or triple quotes, immutable once created |
| List | An ordered, mutable sequence of elements of mixed types, created with square brackets |
| Tuple | An ordered, immutable sequence of elements of mixed types, created with parentheses |
| Set | An unordered collection of unique, hashable elements that supports union, intersection and difference |
| Dictionary (dict) | A mutable mapping of keys to values, optimised for fast lookup by key; preserves insertion order since Python 3.7 |
| Properties to Remember | |
| Mutable vs Immutable | Mutable types (list, set, dict) can be changed in place; immutable types (int, float, str, tuple) cannot |
| Ordered vs Unordered | Lists and tuples preserve insertion order; sets do not expose a meaningful iteration order |
| Indexing | Lists, tuples and strings support positional indexing with x[i], counting from zero |
| Keys and Values | Dictionaries expose .keys(), .values() and .items() methods to iterate over their contents |
| Practical Tools | |
| Type Conversion | Convert between types using int(), float(), str(), list(), tuple() and set(); essential when reading CSV data |
| Choosing a Container | Rule of thumb: list for growing ordered data, tuple for fixed records, set for uniqueness, dict for lookup by name |
| Common Pitfalls | Float equality with ==, mutating tuples, using lists as dict keys, and confusing is with == are classic traps |