Top 100 Python Interview Questions
Top 100 Python interview questions covering data types, OOP, functional programming, decorators, concurrency, Django/Flask, and advanced Python internals.
Python is a high-level, general-purpose, interpreted programming language created by Guido van Rossum and first released in 1991. It emphasizes code readability and simplicity, using indentation to define code blocks instead of braces. Python supports multiple programming paradigms: procedural, object-oriented, and functional. It is dynamically typed (types are checked at runtime) and garbage-collected. Python's vast standard library and the PyPI ecosystem (over 400,000 packages) make it suitable for web development, data science, machine learning, automation, scripting, scientific computing, and much more. Python 3 (current) is not backwards-compatible with Python 2 (which reached end-of-life in 2020).
Python has several built-in data types. Numeric: int (unlimited precision integers), float (64-bit floating point), complex (e.g., 3+4j). Text: str (immutable Unicode string). Sequence: list (mutable ordered sequence), tuple (immutable ordered sequence), range (immutable sequence of numbers). Mapping: dict (mutable key-value store, ordered since Python 3.7). Set: set (mutable unordered unique elements), frozenset (immutable set). Boolean: bool (True/False, subclass of int). Binary: bytes, bytearray, memoryview. None: NoneType — represents the absence of a value.
Both lists and tuples are ordered sequences that can hold items of any type. The key difference is mutability. A list ([1, 2, 3]) is mutable: you can add, remove, and change elements after creation. A tuple ((1, 2, 3)) is immutable: once created, it cannot be changed. Tuples are slightly faster to create than lists, use less memory, and can be used as dictionary keys as long as every element they contain is hashable. Use tuples for fixed collections of heterogeneous data (like a record or coordinates) and lists for homogeneous sequences that may change. Named tuples (collections.namedtuple) add field names to tuples without the overhead of a full class.
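A minimal sketch of the mutability and hashability difference described above. The city names and coordinates are made-up illustration data, not taken from the text.

```python
# A list can be changed in place.
point_list = [1, 2, 3]
point_list[0] = 10

# A tuple rejects item assignment with a TypeError.
point_tuple = (1, 2, 3)
try:
    point_tuple[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

# A tuple of hashable items can be a dictionary key; a list cannot.
locations = {(52.52, 13.40): "Berlin", (48.86, 2.35): "Paris"}
try:
    bad = {[52.52, 13.40]: "Berlin"}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
```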
A list is an ordered sequence accessed by integer indices (0-based). It stores items without explicit keys: fruits = ["apple", "banana", "cherry"]; access by index: fruits[0]. A dictionary is a mapping that stores key-value pairs and, since Python 3.7, preserves insertion order: user = {"name": "Alice", "age": 25}; access by key: user["name"]. Dictionaries provide O(1) average lookup time regardless of size (they are hash maps). Lists provide O(1) access by index but O(n) search by value. Use lists when the order matters and data is accessed by position; use dictionaries when data has meaningful labels/keys and fast lookup by key is needed.
Python uses indentation (whitespace at the beginning of lines) to define code blocks, instead of curly braces ({}) used in most other languages. All statements in a block must use the same indentation level. PEP 8 (Python's style guide) recommends 4 spaces per indentation level — never mix tabs and spaces (Python 3 raises an error for mixed indentation). The colon (:) at the end of compound statements (if, for, while, def, class) signals that a new indented block follows. While this forces consistent formatting and improves readability, incorrect indentation is a common source of IndentationError and TabError exceptions.
Functions in Python are defined with the def keyword: def greet(name, greeting="Hello"): return f"{greeting}, {name}!". Functions support default parameter values, making parameters optional. Python supports positional and keyword arguments: greet("Alice") or greet(greeting="Hi", name="Alice"). *args collects extra positional arguments into a tuple; **kwargs collects extra keyword arguments into a dict. Functions are first-class objects in Python — they can be assigned to variables, passed as arguments, and returned from other functions. Functions without a return statement implicitly return None. Use docstrings to document functions: """This function greets a person.""" as the first statement.
A list comprehension provides a concise way to create a new list by applying an expression to each item in an iterable, optionally filtering items with a condition. Syntax: [expression for item in iterable if condition]. Examples: squares = [x**2 for x in range(10)], evens = [x for x in range(20) if x % 2 == 0], pairs = [(x, y) for x in range(3) for y in range(3)]. List comprehensions are more readable and often faster than equivalent for loops with append(). Similarly, dict comprehensions: {k: v for k, v in items}, set comprehensions: {x**2 for x in range(10)}, and generator expressions (lazy, memory-efficient): (x**2 for x in range(10)).
A decorator is a function that takes another function as an argument, adds some behaviour, and returns a new function (usually a wrapper around the original). The @decorator syntax is syntactic sugar for func = decorator(func). A decorator using functools.wraps: import time; from functools import wraps; def timer(func): @wraps(func) def wrapper(*args, **kwargs): start = time.time(); result = func(*args, **kwargs); print(time.time() - start); return result; return wrapper. Apply: @timer def my_function(): .... Built-in decorators: @staticmethod, @classmethod, @property. Decorators are widely used for logging, timing, authentication, caching (@functools.lru_cache), and input validation. Multiple decorators are applied from bottom to top: the one closest to the function wraps it first.
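The inline timer example can be laid out as a runnable sketch. It swaps time.time() for time.perf_counter(), which is better suited to measuring short intervals; the add function is a hypothetical target used only for the demo.

```python
import time
from functools import wraps

def timer(func):
    """Print how long each call to func takes."""
    @wraps(func)  # copy func's __name__ and docstring onto the wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f}s")
        return result
    return wrapper

@timer  # equivalent to: add = timer(add)
def add(a, b):
    """Return the sum of a and b."""
    return a + b

total = add(2, 3)
```

Without @wraps, add.__name__ would report "wrapper", which makes logs and debugging output harder to read.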
In Python, == checks value equality: whether two objects have the same value. is checks identity: whether two variables point to the exact same object in memory (same id()). Example: a = [1, 2, 3]; b = [1, 2, 3]; a == b is True (same values), but a is b is False (different objects). However: a = b = [1, 2, 3]; a is b is True (same object). Due to CPython's small-integer caching and string interning, is may return True for small integers and some strings, but this is an implementation detail and should not be relied on. Reserve is for singletons: x is None, x is True, x is False. To test for None, always use is None rather than == None.
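A short sketch of the equality versus identity distinction. The describe function is a hypothetical helper added to show the is None idiom.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, separate object
c = a           # second name for the same object

same_value = (a == b)
same_object = (a is b)
alias = (a is c)

def describe(value=None):
    # Identity check against the None singleton.
    if value is None:
        return "no value"
    return f"value: {value}"
```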
A lambda function is a small, anonymous single-expression function defined with the lambda keyword. Syntax: lambda arguments: expression. Example: square = lambda x: x**2; square(5) returns 25. Lambdas are commonly used as quick inline callbacks: sorted(people, key=lambda p: p["age"]), filter(lambda x: x > 0, numbers), map(lambda x: x*2, numbers). Limitations: lambdas can only contain a single expression (no statements, no assignments, no docstrings). For anything non-trivial, use a regular def function instead — it is more readable and testable. In Python, lambdas are most appropriate for simple, one-line operations as arguments to higher-order functions.
*args allows a function to accept any number of positional arguments, collecting them into a tuple: def add(*args): return sum(args) — call as add(1, 2, 3, 4). **kwargs allows any number of keyword arguments, collecting them into a dict: def info(**kwargs): print(kwargs) — call as info(name="Alice", age=25). Both can be used together: def func(*args, **kwargs). Use * to unpack a sequence as positional arguments: func(*[1, 2, 3]). Use ** to unpack a dict as keyword arguments: func(**{"name": "Alice"}). These patterns are essential for writing flexible, forward-compatible APIs and when creating wrapper functions that pass arguments through to another function.
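The packing and unpacking patterns above, sketched together. The function names describe and call_through are hypothetical and exist only for illustration.

```python
def add(*args):
    # args is a tuple of all positional arguments
    return sum(args)

def describe(**kwargs):
    # kwargs is a dict of all keyword arguments
    return ", ".join(f"{key}={value}" for key, value in sorted(kwargs.items()))

def call_through(func, *args, **kwargs):
    # A wrapper that forwards whatever it receives to func.
    return func(*args, **kwargs)

numbers = [1, 2, 3]
options = {"name": "Alice", "age": 25}

total = add(*numbers)            # unpack a list into positional arguments
summary = describe(**options)    # unpack a dict into keyword arguments
forwarded = call_through(add, 4, 5)
```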
Python uses try-except-else-finally blocks for exception handling. try contains code that might raise an exception. except ExceptionType as e catches a specific exception. Multiple except clauses handle different exception types. except (TypeError, ValueError) as e catches multiple types. Bare except catches all exceptions (avoid this). else block runs only if no exception occurred. finally always runs (used for cleanup — closing files, releasing locks). Raise exceptions with raise ValueError("Invalid input"). Re-raise with raise (no arguments). Python's exception hierarchy starts at BaseException → Exception → specific classes. Common exceptions: ValueError, TypeError, KeyError, IndexError, AttributeError, FileNotFoundError, ZeroDivisionError.
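A sketch showing which of the four clauses run in each situation. parse_ratio and its events list are hypothetical names used to record the control flow.

```python
def parse_ratio(numerator, denominator):
    events = []
    try:
        result = int(numerator) / int(denominator)
    except ValueError as exc:
        # int() could not parse one of the inputs
        events.append(f"bad input: {type(exc).__name__}")
        result = None
    except ZeroDivisionError:
        events.append("division by zero")
        result = None
    else:
        # runs only when the try block raised nothing
        events.append("ok")
    finally:
        # runs in every case
        events.append("done")
    return result, events

good = parse_ratio("6", "3")
bad_text = parse_ratio("x", "3")
bad_zero = parse_ratio("1", "0")
```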
Python offers multiple string formatting approaches. f-strings (Python 3.6+, preferred): f"Hello, {name}! Age: {age + 1}" are fast, readable, and support any expression inside braces. str.format(): "Hello, {}! Age: {age}".format(name, age=25). %-formatting (old style, avoid): "Hello, %s! Age: %d" % (name, age). F-strings also support format specifications: f"{value:.2f}" (2 decimal places), f"{number:,}" (thousands separator), f"{name!r}" (repr), f"{value:>10}" (right-align in 10 chars). Multi-line f-strings work normally inside triple quotes. String literals inside the braces should use the other quote style: f"{'yes' if condition else 'no'}". Before Python 3.12, the expression part cannot contain a backslash or reuse the outer quote character.
The with statement ensures resources are properly released after use, even if an exception occurs. It is equivalent to try/finally but cleaner. Example: with open("file.txt") as f: data = f.read() — the file is automatically closed when the block exits. Objects that support the context manager protocol implement __enter__() (setup, returns the object) and __exit__() (cleanup, receives exception info). Multiple context managers: with open("a") as fa, open("b") as fb. Create custom context managers using the @contextmanager decorator from contextlib: yield between setup and teardown code. Other common uses: database connections, threading locks (with lock), temporary directory (tempfile.TemporaryDirectory()), and mocking in tests (with unittest.mock.patch()).
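A minimal sketch of a custom context manager built with contextlib.contextmanager. The tracked function and the log list are hypothetical; they only record the order in which setup and teardown run.

```python
from contextlib import contextmanager

log = []

@contextmanager
def tracked(name):
    log.append(f"enter {name}")      # setup, like __enter__
    try:
        yield name                   # value bound by "as"
    finally:
        log.append(f"exit {name}")   # teardown, like __exit__

with tracked("job") as label:
    log.append(f"working on {label}")

# Teardown still runs when the block raises.
try:
    with tracked("failing"):
        raise RuntimeError("boom")
except RuntimeError:
    log.append("handled")
```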
Python supports full OOP through classes. Define a class: class Animal: def __init__(self, name): self.name = name; def speak(self): raise NotImplementedError. Instantiate: dog = Animal("Rex"). Inheritance: class Dog(Animal): def speak(self): return "Woof!". Call parent: super().__init__(name). Access modifiers: Python uses naming conventions — public (regular), _protected (single underscore, convention only), __private (double underscore, name mangling to _ClassName__attr). Class methods: @classmethod def create(cls, name): return cls(name). Static methods: @staticmethod def validate(name): return bool(name). Python supports multiple inheritance: class C(A, B) with MRO (Method Resolution Order) using C3 linearization.
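The class features above, laid out as a runnable sketch. The breed parameter and the __secret attribute are hypothetical additions used to show super() and name mangling.

```python
class Animal:
    def __init__(self, name):
        self.name = name
        self.__secret = "hidden"   # stored as _Animal__secret

    def speak(self):
        raise NotImplementedError

    @classmethod
    def create(cls, name):
        # cls is the class the method was called on
        return cls(name)

    @staticmethod
    def validate(name):
        return bool(name)

class Dog(Animal):
    def __init__(self, name, breed="unknown"):
        super().__init__(name)     # run the parent initializer
        self.breed = breed

    def speak(self):
        return "Woof!"

dog = Dog("Rex", "collie")
puppy = Dog.create("Fido")
```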
A generator is a function that uses yield to produce a series of values lazily — one at a time — instead of returning all values at once. When called, it returns a generator iterator. Each next() call on the iterator runs until the next yield, pausing execution there. Example: def fibonacci(): a, b = 0, 1; while True: yield a; a, b = b, a+b. Iterate: for n in fibonacci(): if n > 100: break; print(n). Generators use almost no memory for large sequences — they compute each value on demand. yield from iterable delegates to a sub-iterator. Generator expressions: (x**2 for x in range(10)). Use generators for large data streams (reading large files line-by-line), infinite sequences, and data pipelines where you do not need all data at once.
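The Fibonacci generator from the paragraph above, written out in full. The chain_all helper is a hypothetical example of yield from, and itertools.islice is used to take a finite prefix of the infinite sequence.

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:          # infinite: values are produced only on demand
        yield a
        a, b = b, a + b

def chain_all(*iterables):
    for iterable in iterables:
        yield from iterable   # delegate to each sub-iterator in turn

first_ten = list(islice(fibonacci(), 10))

up_to_100 = []
for n in fibonacci():
    if n > 100:
        break
    up_to_100.append(n)

merged = list(chain_all([1, 2], "ab"))
```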
Python has many useful built-in functions that require no import. Numeric: abs(), round(), pow(), divmod(), sum(), min(), max(). Type conversion: int(), float(), str(), bool(), list(), tuple(), dict(), set(). Introspection: type(), isinstance(), issubclass(), dir(), vars(), id(), callable(). Iteration: range(), enumerate() (index + value pairs), zip() (combine iterables), map(), filter(), sorted(), reversed(), next(), iter(), len(). I/O: print(), input(), open(). Functional: any(), all(), hash(). Code: eval(), exec(), compile().
This question is specific to Python 2 vs Python 3. In Python 2, range() returned a full list of numbers in memory, while xrange() returned a lazy xrange object that generated numbers on demand, which was much more memory-efficient for large ranges. In Python 3, xrange() was removed, and range() was reimplemented to behave like Python 2's xrange(): it returns a lazy range object that generates numbers on demand without storing them all in memory. So range(1000000) in Python 3 uses a small, constant amount of memory regardless of size, whereas in Python 2, range(1000000) would create a list of one million integers. A range is a sequence, not a one-shot iterator, so it supports len(), indexing, membership tests, and slicing: range(10)[2:5] returns range(2, 5).
None in Python is a singleton object of type NoneType representing the absence of a value or a null result. It is Python's equivalent of null in other languages. Functions that do not explicitly return a value return None implicitly. Always check for None using is None or is not None — not == None (although it works, is is more Pythonic and slightly faster). None is falsy in boolean context: if not result catches None, but also 0, empty strings, empty lists, etc. Explicitly check: if result is None. Common uses: default parameter values (def func(data=None)), signaling "no result" from a function, and as a sentinel value in optional chaining logic.
A module is a single Python file (.py) that can contain functions, classes, and variables. Import a module: import math or from math import sqrt, pi or import math as m. A package is a directory of modules, conventionally containing an __init__.py file (which can be empty): from mypackage.utils import helper. The __init__.py is executed when the package is imported; since Python 3.3, a directory without one can still be imported as a namespace package. Python's standard library is a collection of built-in modules: os, sys, json, datetime, collections, itertools, functools, pathlib, re, logging. Third-party packages are installed with pip: pip install requests. Namespaces prevent name conflicts between modules. The if __name__ == "__main__": guard lets a module run as a script while keeping the guarded code from executing when the module is imported.
A virtual environment is an isolated Python environment that has its own Python interpreter, pip, and installed packages, separate from the system Python. This prevents version conflicts between projects (for example, project A needing one major version of a library and project B needing a newer, incompatible one). Create: python -m venv venv. Activate on macOS/Linux: source venv/bin/activate; Windows: venv\Scripts\activate. Deactivate: deactivate. Alternatives: virtualenv (faster), conda (for data science, also manages non-Python packages), poetry (dependency management + virtual envs), pipenv. Always add venv/ to .gitignore. Record dependencies: pip freeze > requirements.txt. Install from requirements: pip install -r requirements.txt.
Python comparison operators return booleans: == (equal), != (not equal), <, >, <=, >=. Python supports chained comparisons: 0 < x < 10 (equivalent to 0 < x and x < 10). Logical operators: and, or, not. Python uses short-circuit evaluation: a and b returns a if a is falsy, otherwise b. a or b returns a if truthy, otherwise b. This enables Pythonic patterns: name = input_name or "default". Identity operators: is, is not. Membership operators: in, not in — check membership in sequences, dicts (by key), and sets. Bitwise operators: &, |, ^, ~, <<, >>.
Python has built-in file I/O capabilities. Open a file: f = open("file.txt", "r") — modes: "r" (read), "w" (write, overwrites), "a" (append), "x" (create, fails if exists), "b" (binary), "+" (read+write). Always use the with statement to ensure files are closed: with open("file.txt", "r") as f: content = f.read(). Read methods: read() (entire file), readline() (one line), readlines() (list of lines). Iterate line by line (memory efficient): for line in f. Write: f.write("text"), f.writelines(list). For paths, use pathlib: from pathlib import Path; Path("file.txt").read_text(). Binary files: open with "rb" or "wb" mode. Check existence: Path("file").exists().
When copying mutable objects, you need to distinguish between reference, shallow, and deep copies. A reference copy (b = a) — both variables point to the same object; changes in one affect the other. A shallow copy creates a new object but copies only references to the nested objects: b = a.copy() or b = list(a) or import copy; b = copy.copy(a). The outer object is new, but nested objects are shared. A deep copy recursively copies all nested objects: b = copy.deepcopy(a). Changes to b (including nested) never affect a. Example: if a list contains a list, a shallow copy of the outer list still shares the inner list. Use deep copy when you need a fully independent duplicate of a complex nested structure.
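A sketch of the three kinds of copy on a nested list, showing which copies see a later mutation of the original.

```python
import copy

original = [[1, 2], [3, 4]]
alias = original                 # reference: same object
shallow = copy.copy(original)    # new outer list, shared inner lists
deep = copy.deepcopy(original)   # fully independent

original[0].append(99)    # mutate a nested list
original.append([5, 6])   # mutate the outer list
```

After these two changes, the alias reflects both, the shallow copy reflects only the nested change, and the deep copy reflects neither.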
Python uses LEGB scope rule to resolve names: Local (inside current function), Enclosing (outer function for nested functions), Global (module level), Built-in (Python built-ins). Variables assigned inside a function are local by default. To modify a global variable inside a function, declare it with global varname. To modify an enclosing scope variable (in nested functions), use nonlocal varname. Reading a global variable without modifying it does not require global. Creating a variable with the same name as a global shadows it locally — the global is not affected. Best practice: minimize global state; prefer passing values as arguments and returning results. Constants (by convention in ALL_CAPS) are typically defined at the global/module level.
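A small sketch of name resolution through the enclosing and outer scopes, plus nonlocal. All names here (label, make_counter, and so on) are hypothetical; the global statement is left out to keep the example short.

```python
label = "outer"

def read_outer():
    return label                  # no local "label", so the outer one is found

def shadow():
    label = "local"               # a new local name; the outer one is untouched
    return label

def nested():
    label = "enclosing"
    def inner():
        return label              # resolved in the enclosing function
    return inner()

def make_counter():
    count = 0
    def step():
        nonlocal count            # rebind the enclosing variable
        count += 1
        return count
    return step

step = make_counter()
first, second = step(), step()
length = len("abc")               # len comes from the built-in scope
```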
A Python set is an unordered collection of unique, hashable elements — duplicates are automatically removed. Create: {1, 2, 3} or set([1, 1, 2, 3]) → {1, 2, 3}. Create empty: set() (not {}, which creates an empty dict). Operations: add(), remove() (KeyError if missing), discard() (no error if missing), pop() (removes arbitrary element). Set operations: | union, & intersection, - difference, ^ symmetric difference, <= subset check. Membership test is O(1) (hash-based): 3 in my_set. frozenset is an immutable set (hashable, can be a dict key or set element). Use sets for deduplication, fast membership testing, and set algebra operations.
append(item) adds a single item to the end of a list. If the item is a list itself, it is added as a nested list: my_list.append([4, 5]) → [..., [4, 5]] — the length increases by 1. extend(iterable) adds each element of the iterable individually to the end: my_list.extend([4, 5]) → [..., 4, 5] — the length increases by the iterable's length. extend accepts any iterable (list, tuple, string, range). Related: insert(index, item) inserts at a specific position. + operator creates a new list. += modifies in place (equivalent to extend). Use append for single items, extend for multiple items. list1 + list2 creates a new list without modifying either — prefer extend for performance when modifying in place.
enumerate(iterable, start=0) adds a counter to an iterable and returns it as an enumerate object. Instead of: for i in range(len(items)): print(i, items[i]), use: for i, item in enumerate(items): print(i, item). The start parameter changes the initial counter value: enumerate(items, start=1) starts at 1. Enumerate is especially useful when you need both the index and the value in a loop. It returns (index, value) tuples which can be unpacked directly. Works with any iterable: lists, tuples, strings, generators. Convert to a list: list(enumerate(["a", "b", "c"])) → [(0, "a"), (1, "b"), (2, "c")]. Enumerate is the Pythonic alternative to manual index tracking and is preferred over C-style index loops.
zip(iter1, iter2, ...) takes multiple iterables and pairs up elements at the same position, returning an iterator of tuples. Example: names = ["Alice", "Bob"]; ages = [25, 30]; list(zip(names, ages)) → [("Alice", 25), ("Bob", 30)]. Zip stops at the shortest iterable. Use itertools.zip_longest() to continue to the longest, filling missing values with a fillvalue. Unzip a list of tuples: names, ages = zip(*pairs). Zip with enumerate: for i, (name, age) in enumerate(zip(names, ages)). Zip with dict comprehension: dict(zip(keys, values)). Zip is excellent for parallel iteration of related sequences, transposing matrices, and combining data from two lists into a structured format.
Slicing extracts a portion of sequences (lists, tuples, strings). Syntax: sequence[start:stop:step]. All three parts are optional; with a positive step they default to 0, len(sequence), and 1. lst[1:4] gives the elements at indices 1, 2, 3 (stop is exclusive). lst[:3] gives the first 3 elements. lst[3:] runs from index 3 to the end. lst[-3:] gives the last 3 elements. lst[::2] gives every other element. lst[::-1] gives a reversed shallow copy. Slicing a list always returns a new list. Assign to a slice: lst[1:3] = [10, 20]. Delete a slice: del lst[1:3]. Strings are also sliceable: "Hello World"[6:] → "World". NumPy arrays support multidimensional slicing: arr[0:3, 1:4].
Python dictionaries have rich methods. Access: d["key"] (KeyError if missing), d.get("key", default) (safe, returns default). Modify: d["key"] = value, d.update({"k": "v"}) (merge/update), d.setdefault("key", value) (set only if key absent). Delete: del d["key"], d.pop("key", default) (remove and return), d.popitem() (remove and return last item), d.clear(). Views: d.keys(), d.values(), d.items() — all return dynamic view objects that reflect dict changes. Copy: d.copy() (shallow). Python 3.9+ dict merge: d1 | d2, update: d1 |= d2. collections.defaultdict auto-creates missing keys. collections.Counter is a dict subclass for counting. collections.OrderedDict preserves insertion order (redundant in Python 3.7+).
Both sort iterables but differ in important ways. sorted(iterable, key=None, reverse=False) is a built-in function that returns a new sorted list from any iterable (list, tuple, string, dict). The original is unchanged. list.sort(key=None, reverse=False) is a list method that sorts in-place and returns None. It only works on lists. The key parameter is a function applied to each element before comparison: sorted(words, key=str.lower), sorted(people, key=lambda p: p["age"]). Use operator.itemgetter or attrgetter for faster key extraction. Both use Timsort, a stable sort algorithm (O(n log n), preserves relative order of equal elements). Stability matters for multi-level sorting: to order by age and then by name within each age, sort by name first (the secondary key) and then by age (the primary key).
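A sketch of sorted() versus list.sort() and of how stability supports multi-level sorting. The people records are made-up illustration data.

```python
from operator import itemgetter

people = [
    {"name": "Cara", "age": 30},
    {"name": "Alan", "age": 25},
    {"name": "Bea", "age": 30},
    {"name": "Dov", "age": 25},
]

# sorted() returns a new list; ties keep their original relative order.
by_age = [p["name"] for p in sorted(people, key=itemgetter("age"))]

# Two passes: secondary key first, then primary key.
by_name = sorted(people, key=itemgetter("name"))
by_age_then_name = [p["name"] for p in sorted(by_name, key=itemgetter("age"))]

# list.sort() works in place and returns None.
numbers = [3, 1, 2]
returned = numbers.sort()
```

A single pass with a tuple key, such as key=itemgetter("age", "name"), gives the same ordering when both keys sort in the same direction.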
Type hints (PEP 484, Python 3.5+) allow annotating function parameters and return values, making code more readable and enabling static analysis tools like mypy; variable annotations were added in Python 3.6 (PEP 526). Function annotation: def greet(name: str, age: int = 0) -> str: return f"Hi, {name}". Variable annotation: names: list[str] = []. Optional type: from typing import Optional; def find(id: int) -> Optional[str] (returns str or None). In Python 3.10+: str | None instead of Optional[str]. Common types from typing: List, Dict, Tuple, Set, Union, Callable, Any, TypeVar (for containers, prefer the built-in generics list[str], dict[str, int] in Python 3.9+). Type hints are not enforced at runtime; use mypy/pyright for static checking. They dramatically improve IDE autocomplete and catch bugs early.
Python strings have a rich set of methods. Strings are immutable, so methods that transform text return a new string rather than modifying the original. Case: upper(), lower(), title(), capitalize(), swapcase(). Strip: strip(), lstrip(), rstrip(). Find: find(sub) (returns -1 if not found), index(sub) (raises ValueError), count(sub), startswith(prefix), endswith(suffix). Replace: replace(old, new, count). Split/Join: split(sep), rsplit(sep, maxsplit), splitlines(), join(iterable). Check: isalpha(), isdigit(), isalnum(), isspace(), islower(), isupper(). Pad: center(width), ljust(width), rjust(width), zfill(width). Encode: encode("utf-8").
Python classes have three types of methods. Instance methods (default): receive self (the instance) as first argument — can access and modify instance and class state. Class methods decorated with @classmethod: receive cls (the class) as first argument — can access and modify class state but not instance state. Used as alternative constructors: @classmethod def from_string(cls, s): return cls(*s.split(",")). Static methods decorated with @staticmethod: receive no implicit first argument — they are just regular functions namespaced inside the class. They cannot access or modify class or instance state. Use for utility functions related to the class but not dependent on it. Example: @staticmethod def validate(value): return isinstance(value, int) and value > 0.
The @property decorator allows a method to be accessed like an attribute, implementing getter/setter/deleter logic. Define a getter: @property def full_name(self): return f"{self.first} {self.last}" — access as person.full_name (no parentheses). Define a setter: @full_name.setter def full_name(self, value): self.first, self.last = value.split() — assign as person.full_name = "Alice Smith". Define a deleter: @full_name.deleter def full_name(self): del self.first; del self.last. Properties enforce encapsulation while keeping a clean attribute-access API. Use them for computed attributes, validation on assignment, and lazy-loading. Unlike Java's explicit getters/setters, Python properties let you start with a public attribute and later add logic without changing the API.
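The getter and setter from the paragraph above as a runnable sketch. The length check in the setter is an added assumption, included to show validation on assignment.

```python
class Person:
    def __init__(self, first, last):
        self.first = first
        self.last = last

    @property
    def full_name(self):
        # computed on every access, read like an attribute
        return f"{self.first} {self.last}"

    @full_name.setter
    def full_name(self, value):
        parts = value.split()
        if len(parts) != 2:
            raise ValueError("expected a name of the form 'First Last'")
        self.first, self.last = parts

person = Person("Alice", "Smith")
before = person.full_name
person.full_name = "Bob Jones"

try:
    person.full_name = "Single"
    rejected = False
except ValueError:
    rejected = True
```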
Both are magic methods for string representation of objects. __repr__ should return an unambiguous representation — ideally one that could recreate the object: def __repr__(self): return f"User(name={self.name!r}, age={self.age})". It is called by repr(obj), in the REPL, and by containers (list elements shown in repr). __str__ should return a human-readable string: def __str__(self): return f"{self.name} (age {self.age})". Called by str(obj), print(obj), and f-strings (f"{obj}"). If only __repr__ is defined, it is used as fallback for __str__. Rule of thumb: implement __repr__ always; add __str__ only when you want a friendlier display format different from repr.
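A sketch of where each method is used, including the fallback from __str__ to __repr__. The Tag class is a hypothetical example that defines only __repr__.

```python
class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"User(name={self.name!r}, age={self.age})"

    def __str__(self):
        return f"{self.name} (age {self.age})"

class Tag:
    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"Tag({self.label!r})"

user = User("Alice", 25)
as_repr = repr(user)
as_str = str(user)
in_fstring = f"{user}"        # uses __str__
in_list = str([user])         # containers show the repr of their elements
tag_str = str(Tag("x"))       # no __str__, so __repr__ is used
```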
__new__(cls, *args, **kwargs) is called before __init__; it creates and returns a new instance of the class. Normally you do not override it; it is used for implementing singletons, flyweight patterns, or customizing immutable type creation (subclassing int, str, tuple). __init__(self, *args, **kwargs) initializes the already-created instance; it is often called the constructor, although strictly it is an initializer, and it is the method you normally override. __del__(self) is a finalizer, called when the object is about to be destroyed. It is unreliable, so do not count on it for critical cleanup (use context managers instead). The sequence: __new__ creates the object → __init__ initializes it → (object used) → __del__ finalizes it. In CPython, __del__ usually runs as soon as the reference count drops to zero, but objects in reference cycles wait for the cyclic garbage collector, and the language does not guarantee when, or whether, it runs.
Python supports multiple inheritance: class C(A, B) inherits from both A and B. The Method Resolution Order (MRO) determines the order in which base classes are searched when looking up a method. Python uses the C3 linearization algorithm. View the MRO: C.__mro__ or C.mro(). Generally, the MRO is: the class itself, then left-to-right through base classes, with each class appearing after all classes that inherit from it. super() follows the MRO — even in multiple inheritance, it does not simply call the parent class's method; it calls the next class in the MRO. This enables cooperative multiple inheritance where all classes in the chain properly call super(). The diamond problem (class D inheriting from B and C which both inherit from A) is handled cleanly by MRO — A is only visited once.
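A sketch of the diamond case with cooperative super() calls. The visit method is a hypothetical name; it records the order in which classes are reached.

```python
class A:
    def visit(self):
        return ["A"]

class B(A):
    def visit(self):
        return ["B"] + super().visit()

class C(A):
    def visit(self):
        return ["C"] + super().visit()

class D(B, C):
    def visit(self):
        return ["D"] + super().visit()

mro_names = [cls.__name__ for cls in D.__mro__]
order = D().visit()
```

Inside B.visit, super() resolves to C rather than A when the instance is a D, because the lookup follows the MRO of the instance's class.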
An iterator is an object implementing two methods: __iter__() (returns self) and __next__() (returns the next value, raises StopIteration when exhausted). An iterable is an object implementing __iter__() that returns an iterator (lists, dicts, sets are iterable but not iterators themselves). for item in iterable calls iter(iterable) to get an iterator, then calls next() until StopIteration. Custom iterator: class Counter: def __init__(self, max): self.n = 0; self.max = max; def __iter__(self): return self; def __next__(self): if self.n >= self.max: raise StopIteration; self.n += 1; return self.n. The itertools module provides tools for working with iterators: chain(), cycle(), islice(), takewhile(), groupby().
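The custom iterator above, written out as a runnable sketch. The class is renamed CountUpTo and its parameter limit (both hypothetical names) to avoid confusion with collections.Counter and the built-in max.

```python
class CountUpTo:
    def __init__(self, limit):
        self.n = 0
        self.limit = limit

    def __iter__(self):
        return self            # an iterator returns itself

    def __next__(self):
        if self.n >= self.limit:
            raise StopIteration
        self.n += 1
        return self.n

counter = CountUpTo(3)
values = list(counter)
again = list(counter)          # already exhausted

# A list is iterable but not an iterator: iter() gives a fresh iterator each time.
numbers = [10, 20]
it = iter(numbers)
first = next(it)
second = next(it)
leftover = next(it, "empty")   # default avoids StopIteration
```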
The functools module provides higher-order functions and tools for working with callables. lru_cache(maxsize=128): memoization decorator — caches function results based on input, O(1) lookup. wraps(wrapped): preserves the original function's metadata (name, docstring) when wrapping it in a decorator. partial(func, *args, **kwargs): creates a new function with some arguments pre-filled: double = partial(multiply, 2). reduce(func, iterable, initial): reduces an iterable to a single value by applying a function cumulatively. total_ordering: given __eq__ and one comparison method, fills in the rest. cached_property: like @property but caches the result on first access. singledispatch: function overloading based on the type of the first argument.
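A sketch exercising several of the functools tools listed above. The functions square, multiply, and describe are hypothetical, and the calls list exists only to show that the cache skips the second computation.

```python
from functools import lru_cache, partial, reduce, singledispatch

calls = []

@lru_cache(maxsize=128)
def square(n):
    calls.append(n)            # records real executions only
    return n * n

square(4)
square(4)                      # served from the cache

def multiply(a, b):
    return a * b

double = partial(multiply, 2)  # pre-fill the first argument
product = reduce(multiply, [1, 2, 3, 4], 1)

@singledispatch
def describe(value):
    return "object"

@describe.register(int)
def _(value):
    return "int"

@describe.register(str)
def _(value):
    return "str"
```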
The collections module provides specialized container types. namedtuple: tuple subclass with named fields — Point = namedtuple("Point", ["x", "y"]); p = Point(1, 2); p.x. deque: double-ended queue with O(1) append and popleft — better than list for queue operations. defaultdict: dict that calls a factory for missing keys — defaultdict(list) auto-creates empty lists. Counter: counts hashable objects — Counter("hello") → {"l": 2, "h": 1, "e": 1, "o": 1}; supports most_common(n). OrderedDict: dict that remembers insertion order (less useful in Python 3.7+ where all dicts are ordered). ChainMap: combines multiple dicts into a single view, looking up keys through the chain. UserDict/UserList/UserString: base classes for creating dict/list/string subclasses.
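A short sketch of the main container types in use. The word list and settings dictionaries are made-up illustration data.

```python
from collections import ChainMap, Counter, defaultdict, deque, namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

queue = deque([1, 2, 3])
queue.append(4)                # O(1) at the right end
first = queue.popleft()        # O(1) at the left end

groups = defaultdict(list)     # missing keys start as empty lists
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

counts = Counter("hello")
top = counts.most_common(1)

defaults = {"color": "blue", "size": "M"}
overrides = {"color": "red"}
settings = ChainMap(overrides, defaults)   # earlier maps win
```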
The GIL (Global Interpreter Lock) is a mutex in CPython (the reference Python implementation) that ensures only one thread executes Python bytecode at a time, even on multi-core systems. This simplifies memory management (CPython uses reference counting) but limits true parallel execution of Python code across multiple CPU cores. CPU-bound tasks (heavy computation) do NOT benefit from Python threads due to the GIL. I/O-bound tasks (network requests, disk I/O) DO benefit from threads, because the GIL is released during I/O operations. For CPU parallelism, use multiprocessing (separate processes, each with its own GIL) or concurrent.futures.ProcessPoolExecutor. For I/O parallelism, use threading, asyncio, or concurrent.futures.ThreadPoolExecutor. Some alternative implementations (for example Jython and IronPython) do not have a GIL, and CPython 3.13 introduced an optional, experimental free-threaded build (PEP 703).
The threading module provides thread-based concurrency. Create a thread: t = threading.Thread(target=my_function, args=(arg1,)). Start it: t.start(). Wait for completion: t.join(). Due to the GIL, Python threads are best for I/O-bound tasks (network requests, file I/O). Thread safety: use threading.Lock() to protect shared state: with lock: counter += 1. threading.Event for signaling between threads. threading.Semaphore for limiting concurrent access. threading.local() for thread-local storage. ThreadPoolExecutor from concurrent.futures manages a pool of threads: with ThreadPoolExecutor(max_workers=10) as executor: results = executor.map(fetch_url, urls). This is the modern, recommended way to do concurrent I/O in Python.
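A sketch of a lock protecting shared state and of ThreadPoolExecutor.map. The fake_fetch function is a hypothetical stand-in for a network call; it sleeps briefly instead of doing real I/O.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

state = {"count": 0}
lock = threading.Lock()

def add_many(times):
    for _ in range(times):
        with lock:                   # only one thread updates at a time
            state["count"] += 1

threads = [threading.Thread(target=add_many, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()                         # wait for every thread to finish

def fake_fetch(url):
    time.sleep(0.01)                 # simulated I/O wait
    return f"fetched {url}"

urls = ["a", "b", "c"]
with ThreadPoolExecutor(max_workers=3) as executor:
    # map() yields results in input order, not completion order
    results = list(executor.map(fake_fetch, urls))
```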
asyncio is Python's built-in library for writing single-threaded concurrent code using coroutines and an event loop. Define a coroutine with async def: async def fetch(session, url): async with session.get(url) as response: return await response.text(), where session is an aiohttp.ClientSession (aiohttp is a third-party package and has no module-level get() function). await suspends the coroutine until the awaited operation completes, allowing other coroutines to run. Run: asyncio.run(main()). Run multiple coroutines concurrently: results = await asyncio.gather(fetch(session, url1), fetch(session, url2), fetch(session, url3)). asyncio is ideal for I/O-bound code with many concurrent operations (HTTP clients, WebSocket servers, database connections). Coroutines switch cooperatively at await points inside a single thread, which is much cheaper than an OS-level thread context switch. Key packages: aiohttp (async HTTP), asyncpg (async PostgreSQL), aiomysql. FastAPI and Starlette are async Python web frameworks. asyncio is NOT parallel: it is concurrent but single-threaded.
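A self-contained sketch of asyncio.gather that uses only the standard library. The fake_fetch coroutine is a hypothetical stand-in for an HTTP request; it awaits asyncio.sleep instead of real network I/O.

```python
import asyncio

async def fake_fetch(name, delay):
    await asyncio.sleep(delay)       # yields control to the event loop
    return f"{name} done"

async def main():
    # Both coroutines wait concurrently; results keep argument order.
    return await asyncio.gather(
        fake_fetch("first", 0.02),
        fake_fetch("second", 0.01),
    )

results = asyncio.run(main())
```

Although "second" finishes earlier, gather returns results in the order the awaitables were passed.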
The multiprocessing module bypasses the GIL by creating separate OS processes with their own Python interpreter and memory space. For CPU-bound parallelism: from multiprocessing import Pool; with Pool(processes=4) as pool: results = pool.map(cpu_task, data_list). Process communication via Queue, Pipe, and shared memory (Value, Array). Synchronization with Lock, Semaphore, Event. Modern approach: concurrent.futures.ProcessPoolExecutor — higher-level and consistent API with ThreadPoolExecutor. Processes have higher startup overhead than threads and serialization cost for passing data between processes (via pickle). Use multiprocessing for CPU-intensive work: image processing, data analysis, scientific computation. For I/O-bound work, prefer asyncio or threads.
Dataclasses (Python 3.7+, PEP 557) reduce boilerplate for classes that mainly store data. The @dataclass decorator automatically generates __init__, __repr__, and __eq__ based on annotated fields. @dataclass class Point: x: float; y: float; label: str = "origin". Make immutable with frozen=True — generates __hash__ and raises FrozenInstanceError on modification. field(default_factory=list) for mutable defaults. __post_init__ for post-initialization logic. dataclasses.asdict() converts to dict. dataclasses.fields() inspects fields. Compare to: namedtuple (immutable, tuple-based), attrs (third-party, more features), Pydantic (with runtime validation). Dataclasses are excellent for DTOs, configuration objects, and value objects with automatic boilerplate generation.
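A short sketch of the features listed above. Order and its fields are invented for the example.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass, field


@dataclass
class Order:
    order_id: int
    items: list[str] = field(default_factory=list)  # fresh list per instance
    total: float = 0.0

    def __post_init__(self) -> None:
        # Runs right after the generated __init__.
        if self.total < 0:
            raise ValueError("total must be non-negative")


@dataclass(frozen=True)
class Point:
    x: float
    y: float


first, second = Order(1), Order(2)
first.items.append("book")  # does not affect second.items
```

Because Point is frozen (and eq is on by default), its instances are hashable and can be used as dict keys or set members.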
The pathlib module (Python 3.4+) provides an object-oriented interface to filesystem paths, replacing the older os.path. Create a path: from pathlib import Path; p = Path("/home/user/docs"). Join paths: p / "file.txt" (using the / operator). Read/write text: p.read_text(), p.write_text("content"). Read/write bytes: p.read_bytes(), p.write_bytes(). Inspect: p.exists(), p.is_file(), p.is_dir(), p.suffix (extension), p.stem (name without extension), p.name (filename), p.parent. List contents: list(p.iterdir()), list(p.glob("*.py")), list(p.rglob("*.py")). Create: p.mkdir(parents=True, exist_ok=True). Delete: p.unlink() (file), p.rmdir() (empty dir). Current directory: Path.cwd().
Python's re module provides regular expression support. Common functions: re.match(pattern, string) — matches only at the beginning. re.search(pattern, string) — searches anywhere in the string, returns Match object or None. re.findall(pattern, string) — returns list of all non-overlapping matches. re.finditer(pattern, string) — returns iterator of Match objects. re.sub(pattern, repl, string) — replace matches. re.split(pattern, string) — split by pattern. re.compile(pattern) — compile for reuse. Match groups: re.search(r"(\d+)-(\w+)", text).groups(). Flags: re.IGNORECASE, re.MULTILINE, re.DOTALL. Common patterns: \d (digit), \w (word char), \s (whitespace), .* (greedy any), .*? (lazy any). Always use raw strings (r"pattern") to avoid backslash escaping issues.
Python's built-in json module handles JSON serialization and deserialization. Serialize Python object to JSON string: json.dumps(data, indent=2, sort_keys=True). Deserialize JSON string to Python object: json.loads(json_string). JSON types map to Python: object → dict, array → list, string → str, number → int/float, true/false → True/False, null → None. File I/O: json.dump(data, file_obj) and json.load(file_obj). Custom encoding: subclass json.JSONEncoder and override default() to handle types like datetime, UUID, Decimal (or pass a function as the default= argument). Custom decoding: use the object_hook parameter. For high-performance JSON, consider ujson or orjson; msgpack is a compact binary alternative when the format does not have to be JSON. The dataclasses.asdict() function converts dataclasses to dicts before JSON serialization.
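One way the custom-encoding step might look. ExtendedEncoder is a hypothetical name, and rendering Decimal as a string is a design choice made here to avoid float rounding, not something the json module requires.

```python
import json
from datetime import datetime
from decimal import Decimal
from uuid import UUID


class ExtendedEncoder(json.JSONEncoder):
    def default(self, o):
        # default() is only called for objects json cannot serialize natively.
        if isinstance(o, datetime):
            return o.isoformat()
        if isinstance(o, (UUID, Decimal)):
            return str(o)
        return super().default(o)  # raises TypeError for unknown types


payload = {
    "id": UUID("12345678-1234-5678-1234-567812345678"),
    "created": datetime(2024, 1, 15, 9, 30),
    "price": Decimal("19.99"),
}
encoded = json.dumps(payload, cls=ExtendedEncoder, sort_keys=True)
decoded = json.loads(encoded)
```

Note that the round trip is lossy: decoded holds plain strings, so restoring the original types needs an object_hook or explicit parsing.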
Python's logging module provides a flexible, production-ready logging system. Five severity levels: DEBUG, INFO, WARNING (default threshold), ERROR, CRITICAL. Basic usage: import logging; logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s"); logging.info("App started"). Use named loggers for per-module control: logger = logging.getLogger(__name__). Handlers control where logs go: StreamHandler (console), FileHandler, RotatingFileHandler, SMTPHandler. Formatters control the format. Filters control which records pass through. Configure via dict: logging.config.dictConfig(config). Best practice: use logging.exception() in except blocks to include the full stack trace automatically. Avoid print() for debugging in production — use logging with appropriate levels.
Python's built-in unittest module provides a test framework. Tests extend unittest.TestCase and define test methods starting with test_. Setup/teardown: setUp() runs before each test, tearDown() after. Class-level: setUpClass(cls) / tearDownClass(cls). Assertions: assertEqual(a, b), assertNotEqual(), assertTrue(), assertFalse(), assertIsNone(), assertIn(item, container), assertRaises(Exception, callable, *args). Run: python -m unittest discover or python -m unittest test_module. Mocking: from unittest.mock import Mock, patch, MagicMock; @patch("module.ClassName") def test(mock_class): .... Modern alternative: pytest — simpler syntax (plain functions, not classes), more powerful fixtures, rich plugin ecosystem, and better output. Most Python projects prefer pytest.
pytest is the most popular Python testing framework. Tests are simple functions starting with test_, using plain assert statements: def test_add(): assert add(2, 3) == 5. Run: pytest (discovers tests automatically). Fixtures provide reusable setup/teardown: @pytest.fixture def db(): conn = create_connection(); yield conn; conn.close() — inject into tests by name. Fixture scopes: function, class, module, session. Parameterize: @pytest.mark.parametrize("x,y,result", [(1,2,3),(4,5,9)]). Marks: @pytest.mark.skip, @pytest.mark.xfail, custom marks. Plugins: pytest-mock (mocking), pytest-cov (coverage), pytest-asyncio (async tests), pytest-django. Pytest automatically provides detailed failure diffs and does not require subclassing TestCase.
requests is the most popular Python HTTP library for making web requests. Install: pip install requests. GET: response = requests.get("https://api.example.com/users"). POST with JSON: requests.post(url, json={"name": "Alice"}). Access response: response.status_code, response.json(), response.text, response.content (bytes), response.headers. Parameters: requests.get(url, params={"page": 1}). Headers: requests.get(url, headers={"Authorization": f"Bearer {token}"}). Timeout: requests.get(url, timeout=10) (always set timeouts). Authentication: requests.get(url, auth=("user", "pass")). Sessions (reuse connections, persist cookies): with requests.Session() as session: session.get(url). For async HTTP, use aiohttp or httpx (which supports both sync and async).
Python dependency management has evolved significantly. pip is the standard package installer. requirements.txt lists dependencies: pip freeze > requirements.txt records the exact installed versions, and pip install -r requirements.txt installs them. Poetry is a popular all-in-one tool: it manages dependencies (pyproject.toml), a lock file (poetry.lock), the virtual environment, and building/publishing. Commands: poetry add requests, poetry install, poetry run python app.py. Pipenv combines pip and venv with Pipfile and Pipfile.lock. conda manages both Python packages and non-Python dependencies and is popular in data science. pyproject.toml is the modern standard configuration file, replacing setup.py for most projects: PEP 518 introduced it for build-system requirements and PEP 621 standardized project metadata in it. Use pip-tools for reproducible builds: pip-compile turns loosely specified requirements into a fully pinned requirements.txt. Virtual environments (python -m venv .venv) isolate project dependencies, preventing conflicts between projects.
The contextlib module provides utilities for creating and working with context managers. @contextmanager: creates a context manager from a generator function using yield — code before yield runs in __enter__, code after in __exit__: @contextmanager def managed_resource(): resource = acquire(); try: yield resource; finally: release(resource). contextlib.suppress(*exceptions): suppress specific exceptions: with suppress(FileNotFoundError): os.remove("file.txt"). contextlib.redirect_stdout(f): redirect stdout to a file object. contextlib.ExitStack: manage multiple context managers dynamically — enter variable numbers of CMs at runtime. contextlib.asynccontextmanager: async version of contextmanager. AbstractContextManager: base class for creating CM classes. Context managers are the Pythonic way to handle resource acquisition and release.
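A small sketch combining three of the utilities above. The managed helper and the events list are illustrative only; they record the order in which setup and cleanup run.

```python
import os
import tempfile
from contextlib import ExitStack, contextmanager, suppress

events = []


@contextmanager
def managed(name):
    events.append(f"acquire {name}")  # runs on __enter__
    try:
        yield name
    finally:
        events.append(f"release {name}")  # runs on __exit__, even on error


# suppress() swallows only the listed exception types.
missing = os.path.join(tempfile.gettempdir(), "no-such-file-for-suppress-demo.tmp")
with suppress(FileNotFoundError):
    os.remove(missing)

# ExitStack enters a runtime-determined number of context managers
# and unwinds them in reverse order.
with ExitStack() as stack:
    names = [stack.enter_context(managed(n)) for n in ("a", "b")]
```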
A metaclass is the class of a class — it controls how classes are created, just as classes control how instances are created. In Python, the default metaclass is type. Define a custom metaclass by inheriting from type: class MyMeta(type): def __new__(mcs, name, bases, namespace): return super().__new__(mcs, name, bases, namespace). Apply: class MyClass(metaclass=MyMeta). Metaclasses can: add or modify class attributes, enforce interfaces (raise an error if required methods are not implemented), register classes automatically (plugin systems), and add descriptors. Python uses metaclasses internally for abstract base classes (ABCMeta) and enum creation. Real-world uses: Django's Model system uses metaclasses to create database schemas from field declarations. Rule: if class decorators can solve your problem, use those instead — metaclasses add significant complexity.
Descriptors are objects that define the behavior of attribute access through __get__, __set__, and __delete__ methods. Data descriptors define __set__ and/or __delete__ (usually together with __get__). Non-data descriptors define only __get__. Data descriptors have priority over the instance __dict__; non-data descriptors do not. Example: class Validator: def __set_name__(self, owner, name): self.name = name; def __get__(self, obj, objtype=None): if obj is None: return self; return obj.__dict__.get(self.name); def __set__(self, obj, value): if not isinstance(value, int): raise TypeError; obj.__dict__[self.name] = value. The obj is None check handles access on the class itself rather than on an instance. Python's built-in property, classmethod, staticmethod, and super() are all implemented as descriptors, and so are plain functions, which is how methods get bound to self. Descriptors are the foundation of Python's attribute access system and enable reusable validation, computed attributes, and lazy loading.
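The validation use case can be sketched as follows. PositiveInt and Product are hypothetical names; the positivity rule is added purely for illustration.

```python
class PositiveInt:
    """Data descriptor that only accepts positive integers."""

    def __set_name__(self, owner, name):
        self.name = name  # attribute name the descriptor was assigned to

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not an instance
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        if not isinstance(value, int):
            raise TypeError(f"{self.name} must be an int")
        if value <= 0:
            raise ValueError(f"{self.name} must be positive")
        obj.__dict__[self.name] = value


class Product:
    quantity = PositiveInt()

    def __init__(self, quantity):
        self.quantity = quantity  # goes through PositiveInt.__set__


item = Product(3)
```

Storing the value in the instance __dict__ under the same name is safe here because a data descriptor always wins over the instance dictionary on lookup.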
By default, Python objects store instance attributes in a __dict__ dictionary, which has significant memory overhead. Defining __slots__ in a class replaces the per-instance __dict__ with a fixed, compact layout in memory: class Point: __slots__ = ["x", "y"]; def __init__(self, x, y): self.x = x; self.y = y. Benefits: ~50-70% memory savings per instance (critical for classes with millions of instances, e.g., game objects, financial instruments), slightly faster attribute access. Limitations: cannot add attributes not listed in __slots__ (no dynamic attributes), cannot use __weakref__ unless added to slots, complicates multiple inheritance. Use slots for: data-heavy classes where you create many instances, when memory is a concern, and when the set of attributes is fixed. Python's dataclasses support slots with @dataclass(slots=True) (Python 3.10+).
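A quick sketch of the behavior described above, using a hypothetical SlottedPoint class.

```python
class SlottedPoint:
    __slots__ = ("x", "y")  # fixed attribute set, no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = SlottedPoint(1, 2)
has_dict = hasattr(p, "__dict__")

try:
    p.z = 3  # not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True
```

Actual memory savings depend on the Python version and the number of attributes, so measure with tracemalloc or sys.getsizeof before relying on a specific figure.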
Abstract Base Classes (ABCs) from the abc module define interfaces that subclasses must implement. Use the ABCMeta metaclass or inherit from ABC: from abc import ABC, abstractmethod; class Shape(ABC): @abstractmethod def area(self) -> float: pass; @abstractmethod def perimeter(self) -> float: pass. Attempting to instantiate an abstract class raises TypeError. Concrete subclasses must implement all abstract methods. ABCs also support virtual subclassing: Shape.register(MyShape) makes isinstance() and issubclass() treat MyShape as a Shape without inheritance. Registration is nominal and unchecked: Python does not verify that MyShape actually implements the abstract methods (for structural subtyping, use typing.Protocol). The collections.abc module defines standard ABCs: Iterable, Iterator, Sequence, Mapping, Callable. These are used by type checkers and isinstance() checks: isinstance([], Sequence) returns True. ABCs bridge the gap between duck typing and strict interface enforcement.
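A compact sketch of both inheritance and virtual subclassing. Circle and LegacySquare are invented classes for the example.

```python
import math
from abc import ABC, abstractmethod
from collections.abc import Sequence


class Shape(ABC):
    @abstractmethod
    def area(self) -> float: ...

    @abstractmethod
    def perimeter(self) -> float: ...


class Circle(Shape):
    def __init__(self, r: float) -> None:
        self.r = r

    def area(self) -> float:
        return math.pi * self.r ** 2

    def perimeter(self) -> float:
        return 2 * math.pi * self.r


class LegacySquare:
    """Does not inherit from Shape and defines no perimeter()."""

    def __init__(self, side: float) -> None:
        self.side = side


# Virtual subclassing: accepted by isinstance/issubclass, but never verified.
Shape.register(LegacySquare)
```

LegacySquare passes isinstance checks even though it lacks the abstract methods, which is why registration should be used with care.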
These four tools all create data containers with reduced boilerplate but differ significantly. namedtuple: immutable, tuple-based, memory-efficient, supports unpacking and indexing; use for simple read-only records. @dataclass: mutable by default (frozen=True for immutability), supports inheritance, __post_init__, field() customization, default_factory; use for most data classes. Generates __init__, __repr__, __eq__. Optional: ordering methods with order=True, __hash__ with frozen=True or unsafe_hash=True. attrs (third-party): the most configurable, with validators, converters, slots support, factory functions, frozen classes, and evolve() for modified copies; more features than dataclasses with similar syntax. Pydantic: runtime data validation, JSON serialization, schema generation; the go-to for API data validation (used by FastAPI). Choose: namedtuple for simple immutable records, @dataclass for general purpose, attrs when you need validators or converters without full parsing, Pydantic for validated data from external sources.
CPython uses reference counting as the primary memory management strategy: every object has a reference count that is incremented when a new reference is created and decremented when one goes away. When the count reaches zero, the object is deallocated immediately. This handles most cases efficiently with no GC pause. However, reference counting fails for circular references (A → B → A). CPython's supplementary cycle garbage collector (exposed through the gc module) periodically identifies and frees unreachable reference cycles. The collector has traditionally been organized into three generations: gen0 (young objects, collected frequently), gen1, gen2 (old objects, collected rarely). Disabling it with gc.disable() can help performance-critical code, but only if the code creates no reference cycles, since cycles will then never be freed. Profile memory with tracemalloc. Tools: objgraph (visualize object references), memory_profiler (line-by-line memory usage). An incremental cycle collector that shortens GC pauses was not part of Python 3.12; it was developed for later CPython releases, so check the release notes for the version you run.
Python's dunder (double underscore) methods enable operator overloading and customization of built-in behaviours. Numeric: __add__ (+), __sub__ (-), __mul__ (*), __truediv__ (/), __floordiv__ (//), __mod__ (%), __pow__ (**), plus reflected (__radd__) and in-place (__iadd__) versions. Comparison: __eq__, __ne__, __lt__, __gt__, __le__, __ge__. Container: __len__, __getitem__, __setitem__, __delitem__, __contains__, __iter__, __next__. Context manager: __enter__, __exit__. Callable: __call__ — makes instances callable. Attribute: __getattr__ (called when attribute not found), __getattribute__ (called for every access). Hash: __hash__ (required for dict keys and set elements).
Python is dynamically typed at runtime (types are checked when code executes), but supports gradual static typing through type annotations and external type checkers. mypy is the most popular static type checker — it checks type annotations without running the code. pyright (Microsoft) is faster and used by Pylance in VS Code. Generic types: list[int], dict[str, Any], tuple[int, str, float]. TypeVar for generics: T = TypeVar("T"); def first(lst: list[T]) -> T: return lst[0]. Protocol (structural subtyping): class Drawable(Protocol): def draw(self) -> None — any class with a draw() method satisfies it, without explicit inheritance. TypedDict: type-checked dicts. Literal: specific literal types. Final: constants. Overload: multiple signatures. Runtime type checking: Pydantic validates types at runtime using annotations.
Advanced decorator patterns expand on the basic wrapper concept. Decorator with arguments requires an extra nesting level: def retry(times=3): def decorator(func): @wraps(func) def wrapper(*args, **kwargs): for i in range(times): try: return func(*args, **kwargs); except Exception: if i == times-1: raise; return wrapper; return decorator. Apply: @retry(times=5) def unreliable_call(). Class-based decorator: implement __init__ (stores args) and __call__ (wraps the function). Stacking decorators: @log @timer @validate def func() — applied bottom to top. Decorator for classes: accepts a class and returns a modified class. Real-world patterns: @app.route("/path") (Flask), @cache, @retry, @authenticate, @validate_schema. The functools.wraps decorator preserves the wrapped function's name, docstring, and attributes — always use it.
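The retry decorator from the paragraph, written out as runnable code. The exceptions parameter and the flaky function are additions for illustration; a production version would usually also sleep between attempts.

```python
import functools


def retry(times=3, exceptions=(Exception,)):
    """Decorator factory: re-run the function up to `times` times."""

    def decorator(func):
        @functools.wraps(func)  # keep func's name and docstring
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == times:
                        raise  # out of attempts: propagate the last error

        return wrapper

    return decorator


calls = {"count": 0}


@retry(times=3, exceptions=(ConnectionError,))
def flaky():
    """Fail twice, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = flaky()
```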
Async generators combine async functions and generators: they use async def and yield together. Example: async def paginated_data(url): page = 1; while True: data = await fetch(url, page=page); if not data: break; for item in data: yield item; page += 1. Consume with async for item in paginated_data(url). Async context managers implement __aenter__ and __aexit__ (both coroutines): async with aiofiles.open("file.txt") as f: content = await f.read(). Create with @asynccontextmanager: @asynccontextmanager async def db_transaction(): async with pool.acquire() as conn: try: yield conn; await conn.commit(); except Exception: await conn.rollback(); raise. Re-raise after rolling back so the caller still sees the failure; a bare except that swallows the error would hide it. Here pool and conn stand for whatever async database driver you use. These patterns are essential for building efficient async data pipelines, streaming API responses, and resource management in async web applications.
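A self-contained sketch of both patterns using only the standard library. PAGES, fetch_page, and the log list are stand-ins for a real paginated API and a real database connection.

```python
import asyncio
from contextlib import asynccontextmanager

PAGES = {1: ["a", "b"], 2: ["c"]}  # fake paginated backend


async def fetch_page(page):
    await asyncio.sleep(0)  # placeholder for real network I/O
    return PAGES.get(page, [])


async def paginated_items():
    """Async generator: yields items page by page until a page is empty."""
    page = 1
    while True:
        data = await fetch_page(page)
        if not data:
            break
        for item in data:
            yield item
        page += 1


log = []


@asynccontextmanager
async def transaction():
    log.append("begin")
    try:
        yield log
        log.append("commit")
    except Exception:
        log.append("rollback")
        raise  # let the caller see the original error


async def main():
    async with transaction():
        return [item async for item in paginated_items()]


items = asyncio.run(main())
```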
Python's import system works in steps: it first checks the sys.modules cache, then asks the finders on sys.meta_path, the default path-based one of which searches the directories listed in sys.path. An __init__.py file makes a directory a regular package and is executed when the package is imported; use it to define the public API with __all__ and make submodule contents available at the package level. Directories without __init__.py can still be imported as namespace packages (PEP 420). Relative imports: from . import sibling, from ..parent import module. Lazy imports: defer expensive imports to first use, avoiding slow startup. Custom importers: implement importlib.abc.MetaPathFinder and importlib.abc.Loader for importing from databases, encrypted files, or URLs. importlib.import_module(name): dynamic import by string name. __all__: controls what from module import * exports. Circular imports: common in large projects; resolve them by restructuring, moving shared code into a third module, or importing inside functions.
Python code optimization covers multiple levels. Algorithmic: choose correct data structures (set for membership O(1) vs list O(n), deque for O(1) head operations). Built-ins: built-in functions (implemented in C) are faster than Python equivalents, so use sum(), map(), sorted(). List comprehensions are usually faster than equivalent for-loops with append. Generator expressions for memory efficiency. Local variables are faster than global/attribute access, so cache frequently accessed attributes in local variables inside hot loops. String joining: use "".join(list) not += in loops. __slots__ for memory-heavy classes. lru_cache/cache for memoization. NumPy for array operations (vectorized C operations). Cython/mypyc: compile Python to C extensions. PyPy: a JIT-compiling implementation that often runs pure-Python, CPU-bound code several times faster, though the gain varies by workload and code that leans on C extensions may see little benefit. Profile before optimizing: cProfile, line_profiler, memory_profiler.
Python's dynamic nature means many classic design patterns are simpler or unnecessary. Singleton: use module-level globals (modules are singletons) or a metaclass. Factory: simply a function or class method returning instances. Observer: Python callbacks, signals libraries, or property setters. Strategy: pass callables/lambdas as arguments — functions are first-class. Decorator: Python's @decorator syntax is specifically designed for this. Iterator: implement __iter__/__next__ or use generators. Context Manager: with statement and contextlib. Command: callables and functools.partial. Composite: Python's duck typing makes this natural. Proxy: __getattr__ forwarding. Pythonic alternatives often replace GOF patterns: protocols replace interfaces, first-class functions replace Strategy/Command, decorators replace Proxy/Wrapper. Focus on idiomatic Python (duck typing, EAFP) rather than forcing Java-style patterns.
Django is Python's most popular full-stack web framework, following the "batteries included" philosophy. It provides: ORM (Django ORM with migrations), admin interface (automatic CRUD UI from models), URL routing, template engine, forms with validation, authentication system (users, groups, permissions), security (CSRF, XSS, SQL injection protection built-in). Project structure: django-admin startproject mysite, python manage.py startapp blog. Define models, create migrations (manage.py makemigrations/migrate), define views (function-based or class-based), configure URLs. Django REST Framework (DRF) adds API serialization. Django channels adds WebSocket support. Django follows MTV (Model-Template-View) which is effectively MVC. Its convention-over-configuration approach means fast development for standard web apps.
Flask is a lightweight, micro web framework for Python. Unlike Django's "batteries included" approach, Flask provides only the core (routing, request handling, templating with Jinja2) and lets you choose your own components. A minimal Flask app: from flask import Flask; app = Flask(__name__); then, on its own line directly above the function, @app.route("/") followed by def home(): return "Hello, World!" (a decorator cannot be joined to other statements with a semicolon). Key components: Blueprints (modular app organization), Flask-SQLAlchemy (ORM), Flask-Migrate (database migrations), Flask-Login (authentication), Flask-WTF (forms), Marshmallow (serialization). The application context (current_app, g) and request context (request, session) are exposed through context-local proxies, backed by contextvars in recent Flask versions and by thread-locals in older ones. Flask is excellent for microservices, APIs, and projects where Django's conventions would be overkill. FastAPI is a modern alternative to Flask with automatic API documentation, async support, and Pydantic integration.
Python comprehensions extend beyond simple transformations. Nested comprehensions: [[row[i] for row in matrix] for i in range(len(matrix[0]))] transposes a matrix. Conditional expression in output: [x if x > 0 else -x for x in data] (abs value). Walrus operator in comprehensions (Python 3.8+): [y for x in data if (y := process(x)) is not None] computes process(x) once and uses the result in both the filter and the output. Generator pipelines: chain generator expressions for lazy processing: lines = (l.strip() for l in file); non_empty = (l for l in lines if l); results = (parse(l) for l in non_empty). Dict comprehensions from pairs: {v: k for k, v in original.items()} (invert dict; values must be hashable and unique). Set comprehension for deduplication: {item.lower() for item in tags}. Comprehensions are typically faster than the equivalent for-loop with append because they avoid the repeated method lookup and call; the exact gain depends on the workload and Python version, so measure with timeit.
One of Python's most common gotchas: mutable default arguments are evaluated once when the function is defined, not on each call. def append_to(item, lst=[]): the same list object is reused across all calls without an explicit argument, causing values to persist between calls. Fix: use None as default and create the mutable object inside the function: def append_to(item, lst=None): if lst is None: lst = []; lst.append(item); return lst. This applies to any mutable type as default: lists, dicts, sets. Immutable defaults (int, str, tuple, None) are safe because they cannot be mutated. This behaviour is intentional and documented but surprises nearly every Python beginner. A related trap: functools.lru_cache returns the same cached object on every hit, so if a cached function returns a list or dict, mutating that result changes what later callers receive.
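The gotcha and its fix side by side. The function names append_shared and append_fresh are chosen here for clarity.

```python
def append_shared(item, bucket=[]):
    """Buggy: the default list is created once, at definition time."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Fixed: a new list is created on each call that omits bucket."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_first = append_shared(1)
shared_second = append_shared(2)  # same list object as shared_first

fresh_first = append_fresh(1)
fresh_second = append_fresh(2)  # independent list
```

The shared default is visible on the function object itself as append_shared.__defaults__.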
The walrus operator (:=), introduced in Python 3.8, is an assignment expression that assigns a value to a variable as part of an expression. It allows assigning and using a value in the same expression. Example in a while loop: while chunk := file.read(8192): process(chunk). In a comprehension: results = [y for x in data if (y := process(x)) > 0] — computes y once and uses it in both the condition and output. In an if statement: if m := re.match(pattern, text): print(m.group()). The walrus operator reduces redundant computation and simplifies certain patterns. Use it sparingly — only when it genuinely improves readability. Never use it just to reduce line count; clarity is more important than brevity in Python.
Structural pattern matching (Python 3.10, PEP 634) adds a match/case statement that is more powerful than a simple switch. Example: match command: case "quit": quit(); case "go" if direction: go(direction); case {"action": action, "data": data}: handle(action, data); case _: print("unknown"). Here direction is a variable defined before the match, used in a guard. Patterns: literal (value equality), capture (bind to variable), class (case Point(x=0, y=y)), sequence (case [first, *rest]), mapping (case {"key": value}), OR (case 400 | 404 | 405), guard (case x if x > 0). The _ is the wildcard (it matches anything and does not bind). match is a statement, not an expression, and it does more than compare values: it inspects the structure and type of the subject.
The itertools module provides memory-efficient tools for working with iterators. Infinite iterators: count(start, step) (0, 1, 2...), cycle(iterable) (A, B, C, A, B, C...), repeat(obj, times). Combining: chain(a, b, c) (flatten), chain.from_iterable([[1,2],[3,4]]), zip_longest(a, b, fillvalue=None). Filtering: islice(it, stop) (lazy slice), takewhile(pred, it), dropwhile(pred, it), filterfalse(pred, it). Grouping: groupby(it, key) — groups consecutive identical keys. Combinatorics: product(a, b) (Cartesian product), permutations(it, r), combinations(it, r), combinations_with_replacement(it, r). All are lazy (generator-based). Use itertools for memory-efficient data pipelines when working with large datasets.
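A few of the tools above in action. The word list is sample data; note that groupby only merges adjacent items, so input is normally sorted by the grouping key first.

```python
from itertools import chain, combinations, count, groupby, islice

# Take the first five values from an infinite iterator.
evens = list(islice(count(0, 2), 5))

# Flatten one level of nesting lazily.
flat = list(chain.from_iterable([[1, 2], [3, 4]]))

# Group consecutive items that share a key (input already sorted by key).
words = ["apple", "avocado", "banana", "blueberry", "cherry"]
by_letter = {key: list(group) for key, group in groupby(words, key=lambda w: w[0])}

# Unsorted input yields one group per run, not per distinct key.
runs = [key for key, _ in groupby("aabba")]

# All 2-element combinations, in input order.
pairs = list(combinations("abc", 2))
```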
The __all__ variable is a list of strings defined at the module level that specifies which names should be exported when from module import * is used. Without __all__, all names not starting with an underscore are exported. Example: __all__ = ["public_func", "PublicClass"] — only these names are exported, hiding implementation details. __all__ also serves as documentation of the module's public API and helps IDEs and documentation generators understand what is public. Best practice: always define __all__ in libraries to prevent accidental exposure of private helpers. Note: __all__ only affects from module import * — explicit imports (from module import private_func) are never restricted by __all__. Used alongside _private naming for comprehensive API control.
In multiple inheritance, super() does not simply call the immediate parent class — it follows the MRO (Method Resolution Order) and calls the next class in the MRO chain. This enables cooperative multiple inheritance. Example with diamond inheritance: class A: def method(self): print("A"); class B(A): def method(self): super().method(); print("B"); class C(A): def method(self): super().method(); print("C"); class D(B, C): def method(self): super().method(); print("D"). Calling D().method() prints A, C, B, D — following D→B→C→A MRO. For cooperative MI to work, all classes in the hierarchy must call super(). Arguments must also be consistent throughout the chain — use *args, **kwargs for forward-compatible cooperative methods. This pattern is common in mixin-based frameworks like Django class-based views.
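The diamond example can be run as follows. It appends to a list instead of printing so the call order is easy to inspect.

```python
calls = []


class A:
    def method(self):
        calls.append("A")


class B(A):
    def method(self):
        super().method()  # next in the MRO of the instance, not simply A
        calls.append("B")


class C(A):
    def method(self):
        super().method()
        calls.append("C")


class D(B, C):
    def method(self):
        super().method()
        calls.append("D")


D().method()
mro_names = [cls.__name__ for cls in D.__mro__]
```

Inside B.method, super() resolves to C because the instance is a D; calling B().method() directly would go straight from B to A.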
The weakref module provides weak reference objects that refer to an object without increasing its reference count, allowing the object to be garbage collected when no strong references exist. Create: import weakref; obj = MyObject(); ref = weakref.ref(obj); print(ref()) — calling the ref returns the object or None if collected. WeakValueDictionary: a dict that holds weak references to values — entries vanish when values are collected. WeakKeyDictionary: entries vanish when keys are collected. WeakSet: like WeakValueDictionary but a set. Use cases: caches (let entries be freed when not used elsewhere), preventing memory leaks in observer patterns (listener holding reference to subject), and circular references. Contrast: regular references keep objects alive; weak references observe without keeping alive.
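A minimal sketch of a weak reference and a WeakValueDictionary cache. Resource is a hypothetical class; the immediate cleanup shown relies on CPython's reference counting, and other implementations may free the object later.

```python
import gc
import weakref


class Resource:
    def __init__(self, name):
        self.name = name


res = Resource("db")
ref = weakref.ref(res)               # does not keep res alive
cache = weakref.WeakValueDictionary()
cache["db"] = res                    # value is held weakly

alive_before = ref() is res

del res        # drop the only strong reference
gc.collect()   # not needed on CPython, harmless elsewhere

alive_after = ref() is not None
cached_after = "db" in cache
```

Built-in types such as int, list, and tuple cannot be weakly referenced directly; instances of ordinary user-defined classes can.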
The subprocess module runs external commands and system processes from Python. Modern API uses subprocess.run(): result = subprocess.run(["ls", "-la"], capture_output=True, text=True, timeout=30). Access output: result.stdout, result.stderr, result.returncode. Raise exception on failure: check=True raises CalledProcessError if returncode != 0. Pipe input: subprocess.run(["grep", "error"], input="log data", text=True). For long-running processes: use subprocess.Popen directly for streaming I/O. Security: never use shell=True with user-supplied input, because it leads to shell injection. Prefer list arguments (not strings). Environment variables: env={**os.environ, "PATH": "/usr/bin"} (put overrides after **os.environ, because later keys win). Alternatives: os.system() (not formally deprecated, but the documentation recommends subprocess instead), sh library (more ergonomic wrapper).
The argparse module parses command-line arguments for Python scripts. Create a parser: parser = argparse.ArgumentParser(description="Process files"); parser.add_argument("filename", help="Input file"); parser.add_argument("-v", "--verbose", action="store_true"); parser.add_argument("-n", "--count", type=int, default=10); args = parser.parse_args(). Access: args.filename, args.verbose, args.count. Argument types: positional (required), optional (-v/--verbose), action="store_true" (flag), choices=["json", "csv"] (restricted values), nargs="+" (one or more). Subparsers for sub-commands: parser.add_subparsers(dest="command"). Auto-generates --help. Alternatives: click (decorator-based, more ergonomic), typer (uses type hints for argument definition — modern favorite).
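The parser from the paragraph, with an explicit argument list passed to parse_args() so it can be tried without a command line. The --format option is an extra added here to show choices.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Process files")
    parser.add_argument("filename", help="Input file")                # positional
    parser.add_argument("-v", "--verbose", action="store_true")       # flag
    parser.add_argument("-n", "--count", type=int, default=10)        # typed option
    parser.add_argument("--format", choices=["json", "csv"], default="json")
    return parser


# Passing a list instead of relying on sys.argv makes the parser easy to test.
args = build_parser().parse_args(["data.txt", "-v", "-n", "3"])
```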
The __call__ method makes instances of a class callable — you can use them like functions: obj(args). Example: class Multiplier: def __init__(self, n): self.n = n; def __call__(self, x): return x * self.n. Use: double = Multiplier(2); double(5) returns 10. Check if callable: callable(obj). Use cases: function objects with state (like closures but as classes), decorators as classes, partial application, and function factories. Stateful callable classes are often more readable than closures for complex logic. Comparison: a class with __call__ is like a closure — it holds state between calls. functools.partial creates callables with pre-filled arguments. Django middleware, Werkzeug WSGI apps, and many frameworks use callable objects as middleware/handlers.
The pprint module (pretty-print) formats complex Python data structures in a human-readable way. from pprint import pprint; pprint(complex_dict, indent=2, width=80). For nested structures, pprint adds indentation and line breaks at appropriate places. pformat(data) returns the formatted string instead of printing. Key parameters: indent (indentation per level), width (line width limit), depth (max nesting depth before truncating with ...), compact (fit as many sequence items on each line as the width allows). Since Python 3.8, sort_dicts=False keeps insertion order (dict keys are sorted by default). Very useful when debugging large API responses, configuration objects, and deeply nested data. For JSON-compatible data, json.dumps(data, indent=2) is often more readable. The rich third-party library provides even prettier output with syntax highlighting and tree views.
@functools.lru_cache(maxsize=128) is a memoization decorator that caches the results of function calls based on input arguments, returning cached results on subsequent calls with the same arguments. LRU = Least Recently Used: when the cache is full, the least recently used entry is discarded. @lru_cache(maxsize=None) (or @functools.cache in Python 3.9+) has unlimited size. All arguments must be hashable (no lists or dicts). Useful for: recursive algorithms with overlapping subproblems (naive Fibonacci drops from exponential to linear time), expensive database queries, API calls, and computationally heavy transformations, provided the function is deterministic and stale results are acceptable. Check cache stats: func.cache_info() shows hits, misses, maxsize, currsize. Clear cache: func.cache_clear(). On instance methods, lru_cache keys on self and keeps each instance alive until its entry is evicted; consider the third-party methodtools.lru_cache, or @functools.cached_property for a value computed once per instance.
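A short sketch of memoized Fibonacci together with the cache statistics mentioned above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Naive recursion; the cache ensures each n is computed only once."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


value = fib(30)
info = fib.cache_info()  # named tuple: hits, misses, maxsize, currsize
```

Each of the 31 distinct arguments (0 through 30) is computed once, so info.misses is 31; every other recursive call is served from the cache.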
NumPy is the fundamental package for scientific computing in Python, providing a powerful N-dimensional array object and vectorized mathematical operations. The core is the ndarray — a contiguous block of memory storing homogeneous data. Operations on arrays are vectorized (applied element-wise in C) — 10-100x faster than Python loops. Create: np.array([1, 2, 3]), np.zeros((3, 4)), np.ones(), np.arange(0, 10, 0.5), np.linspace(0, 1, 100). Operations: arr * 2, arr + arr, np.sqrt(arr). Indexing: arr[0, 1], slicing: arr[1:3, :], boolean indexing: arr[arr > 5]. Broadcasting: operations on different-shaped arrays. Reshape: arr.reshape(2, 3). Linear algebra: np.dot(), np.linalg.solve(). NumPy is the foundation of the Python data science stack (Pandas, SciPy, Matplotlib, scikit-learn).
Pandas is Python's primary data manipulation library, built on NumPy. It provides two main data structures: Series (1D labeled array) and DataFrame (2D labeled table — like a spreadsheet or SQL table). Load CSV: df = pd.read_csv("data.csv"). Inspect: df.head(), df.info(), df.describe(). Select: df["col"] (Series), df[["col1", "col2"]] (DataFrame), df.loc[row, col] (label-based), df.iloc[0, 1] (position-based). Filter: df[df["age"] > 25]. Group: df.groupby("city")["salary"].mean(). Merge: pd.merge(df1, df2, on="id"). Handle missing: df.dropna(), df.fillna(0). Apply functions: df["name"].apply(str.upper). Pandas is indispensable for data cleaning, exploration, and transformation in data science workflows.
FastAPI is a modern, high-performance Python web framework for building APIs. It is based on Starlette (ASGI) and Pydantic and uses type hints for automatic request validation, serialization, and documentation. from fastapi import FastAPI; app = FastAPI(); @app.get("/users/{user_id}") async def get_user(user_id: int, include_inactive: bool = False) -> dict: return {"id": user_id}. FastAPI automatically: validates types, generates an OpenAPI schema, and creates interactive API docs at /docs (Swagger UI) and /redoc. Dependency injection: def endpoint(db: Session = Depends(get_db)). Async: use async def for non-blocking endpoints; plain def endpoints are run in a thread pool. Pydantic models for request body: class UserCreate(BaseModel): name: str; email: EmailStr. FastAPI is among the faster Python frameworks (its documentation cites benchmarks on par with Node.js and Go) and has become a widely adopted choice for new Python APIs, replacing Flask for many use cases.
Python 3 supports exception chaining: when an exception is raised while another is being handled, both are reported. Implicit chaining: raising a new exception inside an except block without a from clause automatically stores the exception being handled in __context__: try: int("abc"); except ValueError: raise RuntimeError("Conversion failed"). Explicit chaining: with except ValueError as e, raise RuntimeError("Conversion failed") from e sets the original ValueError as __cause__ and marks it as the direct cause. Suppress chaining: raise New() from None hides the original exception context in the traceback. Access: exc.__cause__ (explicit), exc.__context__ (implicit). In tracebacks, Python shows the full chain, joined by "During handling of the above exception, another exception occurred" for implicit chaining and "The above exception was the direct cause of the following exception" for explicit chaining. This is invaluable for debugging because you see both the original problem and the secondary failure. Use raise New() from original when wrapping low-level exceptions in higher-level abstractions.
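The three chaining modes can be compared side by side. The following is a minimal sketch: ConfigError, the parse_port_* functions, and the capture helper are hypothetical names introduced only for this illustration.

```python
class ConfigError(Exception):
    """Hypothetical high-level error used only for this illustration."""


def parse_port_explicit(raw):
    try:
        return int(raw)
    except ValueError as exc:
        # Explicit chaining: "from exc" stores the original in __cause__.
        raise ConfigError(f"invalid port: {raw!r}") from exc


def parse_port_implicit(raw):
    try:
        return int(raw)
    except ValueError:
        # Implicit chaining: no "from", so only __context__ is set.
        raise ConfigError(f"invalid port: {raw!r}")


def parse_port_suppressed(raw):
    try:
        return int(raw)
    except ValueError:
        # "from None" hides the original exception in the traceback.
        raise ConfigError(f"invalid port: {raw!r}") from None


def capture(func, raw):
    """Call func(raw) and return the ConfigError it raises, if any."""
    try:
        func(raw)
    except ConfigError as err:
        return err
    return None


explicit = capture(parse_port_explicit, "abc")
implicit = capture(parse_port_implicit, "abc")
suppressed = capture(parse_port_suppressed, "abc")

print(type(explicit.__cause__).__name__)    # ValueError
print(implicit.__cause__)                   # None
print(type(implicit.__context__).__name__)  # ValueError
print(suppressed.__suppress_context__)      # True
```

Note that from None does not erase __context__; it only sets __suppress_context__ so the traceback omits the original exception.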
Protocol (Python 3.8+, typing.Protocol) enables structural subtyping (duck typing with static type checking). Define a protocol: from typing import Protocol; class Drawable(Protocol): def draw(self) -> None: .... Any class that has a compatible draw() method satisfies this protocol without explicitly inheriting from it. Type checkers (mypy, pyright) verify structural compatibility. Runtime check: isinstance(obj, Drawable) works only if @runtime_checkable is added to the Protocol, and it checks only that the members exist, not their signatures. Compare to abstract classes: ABCs generally require explicit inheritance or registration; Protocols check structure only. Use protocols for: accepting any "file-like object" (for example, a protocol that declares a read() method), any "iterable" (Iterable), or domain-specific interfaces without coupling to specific implementations. The typing module also ships ready-made runtime-checkable protocols such as SupportsInt and SupportsFloat, and several collections.abc classes (Iterable, Sized, Hashable) already perform structural isinstance checks. Protocols are the preferred way to express "duck typing" in typed Python code.
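A short sketch of structural subtyping with a runtime check follows. Circle, Label, and render are hypothetical names, and draw() returns a string here (rather than None) purely so the result is easy to inspect.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Drawable(Protocol):
    def draw(self) -> str: ...


class Circle:
    # Does not inherit from Drawable, but has a matching draw() method.
    def draw(self) -> str:
        return "circle"


class Label:
    # Has no draw() method, so it does not satisfy the protocol.
    text = "hello"


def render(shape: Drawable) -> str:
    # A static type checker accepts Circle here and rejects Label.
    return shape.draw()


print(isinstance(Circle(), Drawable))  # True
print(isinstance(Label(), Drawable))   # False
print(render(Circle()))                # circle
```

The isinstance check only confirms that a draw attribute exists; signature compatibility is verified by the static type checker, not at runtime.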
__init_subclass__(cls, **kwargs) is an implicit class method that Python calls on a parent class whenever it is subclassed, allowing the parent to hook into subclass creation without a metaclass. Example (plugin registry): class Plugin: registry = {}; def __init_subclass__(cls, name="", **kwargs): super().__init_subclass__(**kwargs); if name: Plugin.registry[name] = cls. Subclass: class LogPlugin(Plugin, name="log") is registered automatically. Class decorators are simpler than metaclasses for modifying classes: def add_repr(cls): cls.__repr__ = lambda self: f"{cls.__name__}({vars(self)})"; return cls; @add_repr class MyClass: .... Choose: metaclasses for deep class creation control, __init_subclass__ for subclass registration/notification, class decorators for adding/modifying attributes after creation. Unlike __init_subclass__, a class decorator is applied only to the class it decorates and is not re-run for subclasses. __init_subclass__ is the modern, simpler alternative to metaclass hooks for most use cases.
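Written out as regular multi-line code, the registry and decorator patterns might look like the sketch below. AnonymousPlugin and Point are hypothetical classes added for illustration.

```python
class Plugin:
    registry = {}

    def __init_subclass__(cls, name="", **kwargs):
        # Runs once for every subclass of Plugin, at class creation time.
        super().__init_subclass__(**kwargs)
        if name:
            Plugin.registry[name] = cls


class LogPlugin(Plugin, name="log"):
    pass


class AnonymousPlugin(Plugin):
    # No name keyword, so this subclass is not registered.
    pass


def add_repr(cls):
    """Class decorator that attaches a generic __repr__."""
    def __repr__(self):
        return f"{cls.__name__}({vars(self)})"

    cls.__repr__ = __repr__
    return cls


@add_repr
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


print(Plugin.registry)  # {'log': <class '...LogPlugin'>}
print(Point(1, 2))      # Point({'x': 1, 'y': 2})
```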
Advanced asyncio patterns for production applications. Task management: task = asyncio.create_task(coro()) schedules the coroutine to run concurrently; keep a reference to the task so it is not garbage-collected mid-run, and cancel it with task.cancel(). Timeout: async with asyncio.timeout(5): await fetch() (Python 3.11+) or await asyncio.wait_for(coro(), timeout=5). Gather with errors: asyncio.gather(*tasks, return_exceptions=True) returns exceptions as values instead of propagating the first one. Semaphore (limit concurrency): sem = asyncio.Semaphore(10); async with sem: await fetch(). Queue: asyncio.Queue() for producer-consumer patterns. Synchronization: asyncio.Lock(), asyncio.Event(). Delayed callbacks: asyncio.get_running_loop().call_later(5, callback) schedules a plain function to run after 5 seconds. Running in executor: await loop.run_in_executor(None, blocking_function), or await asyncio.to_thread(blocking_function) on Python 3.9+, runs blocking code in a thread pool without blocking the event loop. Prefer async libraries (aiohttp over requests, asyncpg over psycopg2) in async code, since a blocking call stalls every task on the loop.
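A few of these pieces (tasks, a semaphore, gather with return_exceptions, and a timeout) can be combined in one small sketch. The fetch and slow coroutines are hypothetical stand-ins that simulate I/O with asyncio.sleep, and the failure on item 3 is deliberate.

```python
import asyncio


async def fetch(item, sem):
    # Hypothetical stand-in for real I/O; at most 2 run at the same time.
    async with sem:
        await asyncio.sleep(0.01)
        if item == 3:
            raise ValueError(f"item {item} failed")
        return item * 2


async def slow():
    await asyncio.sleep(1)
    return "done"


async def main():
    sem = asyncio.Semaphore(2)
    tasks = [asyncio.create_task(fetch(i, sem)) for i in range(5)]
    # Exceptions come back as values, in the same order as the tasks.
    results = await asyncio.gather(*tasks, return_exceptions=True)

    try:
        await asyncio.wait_for(slow(), timeout=0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return results, timed_out


results, timed_out = asyncio.run(main())
print(results)    # [0, 2, 4, ValueError('item 3 failed'), 8]
print(timed_out)  # True
```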
Implementing the context manager protocol via __enter__ and __exit__ gives full control over resource management. __enter__(self): called when entering the with block, returns the resource (or self). __exit__(self, exc_type, exc_val, exc_tb): called on exit, receives exception info (all None if no exception). Return True to suppress the exception; return False/None to propagate it. Example: class DatabaseConnection: def __enter__(self): self.conn = connect(); return self.conn; def __exit__(self, et, ev, tb): if et: self.conn.rollback(); else: self.conn.commit(); self.conn.close(); return False. Class-based CMs are useful when the manager holds state, is reused across several with blocks, or has complex cleanup. For simpler cases, use @contextmanager from contextlib. Async context managers implement __aenter__ and __aexit__ (both coroutines) for use with async with.
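The commit-or-rollback pattern can be exercised without a real database. In this sketch, FakeConnection is a hypothetical stand-in that only records which methods were called, and Transaction is an illustrative name for the context manager.

```python
class FakeConnection:
    """Hypothetical stand-in for a DB connection; records method calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


class Transaction:
    def __init__(self, conn):
        self.conn = conn

    def __enter__(self):
        return self.conn

    def __exit__(self, exc_type, exc_val, exc_tb):
        try:
            if exc_type is None:
                self.conn.commit()
            else:
                self.conn.rollback()
        finally:
            # Close even if commit or rollback itself fails.
            self.conn.close()
        return False  # propagate any exception from the with block


ok_conn = FakeConnection()
with Transaction(ok_conn) as conn:
    conn.calls.append("insert")

bad_conn = FakeConnection()
try:
    with Transaction(bad_conn):
        raise RuntimeError("boom")
except RuntimeError:
    propagated = True
else:
    propagated = False

print(ok_conn.calls)   # ['insert', 'commit', 'close']
print(bad_conn.calls)  # ['rollback', 'close']
print(propagated)      # True
```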
The pickle module serializes Python objects to a byte stream ("pickling") and deserializes them back ("unpickling"). import pickle; data = pickle.dumps(obj) (to bytes) and pickle.loads(data) (from bytes). File I/O: pickle.dump(obj, file) and pickle.load(file), with the file opened in binary mode. Pickle can serialize most Python objects, including instances of custom classes and complex nested structures. Functions and classes are pickled by reference (module and qualified name), so lambdas and locally defined functions or classes cannot be pickled. Protocol versions (0-5): higher means more efficient/compact. Security warning: never unpickle data from untrusted sources, because a malicious pickle can execute arbitrary code during deserialization. Use JSON, MessagePack, or Protocol Buffers for data exchange with external systems. Legitimate uses: caching ML models (scikit-learn), saving game state, IPC between Python processes (multiprocessing uses pickle). Custom serialization: implement __getstate__ and __setstate__. The copyreg module registers custom pickle functions for types that are not picklable by default.
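The sketch below round-trips plain data, customizes pickling with __getstate__ and __setstate__, and shows that a lambda is rejected. Session is a hypothetical class, and the example assumes it is defined at module level so pickle can locate it by name when loading.

```python
import pickle


class Session:
    """Hypothetical class holding a user name plus a non-picklable helper."""

    def __init__(self, user):
        self.user = user
        self.cache = lambda key: key.upper()  # lambdas cannot be pickled

    def __getstate__(self):
        # Drop the lambda before pickling.
        state = self.__dict__.copy()
        del state["cache"]
        return state

    def __setstate__(self, state):
        # Restore attributes and rebuild the helper after unpickling.
        self.__dict__.update(state)
        self.cache = lambda key: key.upper()


payload = {"ids": [1, 2, 3], "point": (1.5, 2.5)}
roundtrip = pickle.loads(pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL))

restored = pickle.loads(pickle.dumps(Session("ada")))

try:
    pickle.dumps(lambda x: x)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(roundtrip == payload)  # True
print(restored.user)         # ada
print(lambda_pickled)        # False
```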
The os module provides a portable way to interact with the operating system. File operations: os.getcwd() (current directory), os.chdir(path), os.listdir(path), os.mkdir(path), os.makedirs(path, exist_ok=True), os.remove(path), os.rename(src, dst). Path: os.path.join("dir", "file.txt"), os.path.exists(), os.path.abspath(), os.path.basename(), os.path.dirname(). Environment: os.environ.get("HOME") (returns None if unset), os.environ["PATH"] (raises KeyError if unset). Process: os.getpid(), os.system("cmd") (prefer subprocess.run, which avoids the shell by default and captures output). The sys module provides Python runtime information: sys.argv (command-line arguments), sys.path (module search path), sys.version, sys.platform, sys.exit(code) (exit program), sys.stdin/stdout/stderr, sys.getrecursionlimit(), sys.getsizeof(obj) (size of the object itself in bytes, not counting objects it references).
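A brief sketch of the file and path helpers follows, run inside a temporary directory so nothing is left behind. The directory and file names ("reports", "2024", "summary.txt") and the environment variable name are hypothetical.

```python
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, "reports", "2024")
    os.makedirs(nested, exist_ok=True)
    os.makedirs(nested, exist_ok=True)  # no error the second time

    path = os.path.join(nested, "summary.txt")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("ok")

    existed = os.path.exists(path)
    name = os.path.basename(path)
    parent = os.path.basename(os.path.dirname(path))
    listing = os.listdir(nested)

# .get() with a default avoids a KeyError for unset variables.
missing = os.environ.get("NO_SUCH_VAR_FOR_DEMO", "fallback")

# Shallow size only: the elements of a list are not included.
empty_list_size = sys.getsizeof([])

print(existed, name, parent, listing)  # True summary.txt 2024 ['summary.txt']
print(missing)                         # fallback
```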
The copy module provides object copying functions. copy.copy(obj) creates a shallow copy: a new object whose contents are references to the same objects as the original. For containers (lists, dicts), a new container is created but the elements inside are still the same objects. copy.deepcopy(obj) creates a deep copy: it recursively copies all nested objects so the copy is completely independent. Deep copy handles circular references automatically using a memo dict. Custom copy behavior: implement __copy__(self) for shallow and __deepcopy__(self, memo) for deep copy. Performance: deep copy is significantly slower than shallow copy. Avoid deep-copying large structures unnecessarily. In practice: for immutable objects (int, str, and tuples that contain only immutable items), copying is unnecessary since they cannot change. For mutable nested structures that you need to modify independently, use deep copy.
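The difference shows up as soon as a nested object is mutated. A minimal sketch, with hypothetical dictionary contents:

```python
import copy

original = {"name": "config", "ports": [80, 443]}
shallow = copy.copy(original)
deep = copy.deepcopy(original)

# Mutate the nested list through the original.
original["ports"].append(8080)

print(shallow["ports"])  # [80, 443, 8080]  (shared inner list)
print(deep["ports"])     # [80, 443]        (independent copy)

# deepcopy preserves circular references instead of recursing forever.
node = {"label": "root"}
node["self"] = node
node_copy = copy.deepcopy(node)
print(node_copy["self"] is node_copy)  # True
```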
The heapq module implements a min-heap using a regular Python list. The heap property guarantees that heap[0] is always the smallest element. Key functions: heapq.heappush(heap, item) — add an item (O(log n)). heapq.heappop(heap) — remove and return the smallest item (O(log n)). heapq.heappushpop(heap, item) — push then pop (more efficient than separate calls). heapq.heapify(list) — convert a list to a heap in O(n). heapq.nlargest(n, iterable) and heapq.nsmallest(n, iterable) — efficiently find the n largest/smallest without full sort. For a max-heap, negate values: heappush(heap, -value). Use heapq for: priority queues (Dijkstra's algorithm, A* search), efficiently finding k-th smallest/largest elements, and any scenario requiring "get the minimum quickly" without full sorting.
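The core operations can be sketched in a few lines. The numbers and task names below are arbitrary illustration data; the priority queue stores (priority, task) tuples so that the lowest priority number is popped first.

```python
import heapq

nums = [7, 2, 9, 4, 1]

heap = list(nums)
heapq.heapify(heap)             # O(n) in-place conversion
heapq.heappush(heap, 3)
smallest = heapq.heappop(heap)  # removes and returns the minimum

top_two = heapq.nlargest(2, nums)

# Max-heap behaviour by negating values on the way in and out.
max_heap = [-n for n in nums]
heapq.heapify(max_heap)
largest = -heapq.heappop(max_heap)

# Priority queue: tuples compare by their first element.
tasks = []
heapq.heappush(tasks, (2, "write tests"))
heapq.heappush(tasks, (1, "fix bug"))
heapq.heappush(tasks, (3, "deploy"))
order = [heapq.heappop(tasks)[1] for _ in range(3)]

print(smallest)  # 1
print(top_two)   # [9, 7]
print(largest)   # 9
print(order)     # ['fix bug', 'write tests', 'deploy']
```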
The enum module (Python 3.4+) provides a formal way to define named constants. from enum import Enum, auto; class Color(Enum): RED = 1; GREEN = 2; BLUE = 3. Access: Color.RED (member), Color.RED.value (1), Color.RED.name ("RED"). Look up: Color(1) by value, Color["RED"] by name. Iterate: list(Color). Check membership: isinstance(Color.RED, Color). auto() assigns sequential values automatically. IntEnum: members are also ints, so they support ordering and comparison with plain integers. Flag/IntFlag: for bitwise operations (permissions): class Perm(Flag): READ = auto(); WRITE = auto(); EXECUTE = auto(); RW = READ | WRITE. StrEnum (Python 3.11+): string-valued enum. Enums are iterable and hashable (usable as dict keys), and members compare by identity and equality; ordering comparisons such as < require IntEnum or custom methods. They replace magic number/string constants with readable, type-safe values. Enums work well with match/case (Python 3.10+), and static type checkers can verify that every member is handled.
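A compact sketch of the main enum flavours follows. Priority is a hypothetical IntEnum added to show ordering; Color and Perm mirror the definitions in the text.

```python
from enum import Enum, Flag, IntEnum, auto


class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3


class Priority(IntEnum):
    LOW = 1
    HIGH = 2


class Perm(Flag):
    READ = auto()      # 1
    WRITE = auto()     # 2
    EXECUTE = auto()   # 4
    RW = READ | WRITE  # 3


names = [c.name for c in Color]
by_value = Color(2)
by_name = Color["BLUE"]
can_write = Perm.WRITE in Perm.RW
can_exec = Perm.EXECUTE in Perm.RW

# Plain Enum members do not support ordering comparisons.
try:
    Color.RED < Color.GREEN
    orderable = True
except TypeError:
    orderable = False

print(names)                         # ['RED', 'GREEN', 'BLUE']
print(by_value, by_name)             # Color.GREEN Color.BLUE
print(Priority.LOW < Priority.HIGH)  # True
print(can_write, can_exec)           # True False
print(orderable)                     # False
```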
The bisect module provides binary search functions for working with sorted lists; each search takes O(log n) time. bisect.bisect_left(lst, x) returns the leftmost position where x can be inserted to keep the list sorted (before any existing elements equal to x). bisect.bisect_right(lst, x) (same as bisect.bisect()) returns the rightmost insertion point (after any existing elements equal to x). bisect.insort_left(lst, x) inserts x at the correct position (O(log n) search but O(n) insert due to list shifting). Example (grade classification): def grade(score): return "FDCBA"[bisect.bisect([60, 70, 80, 90], score)], so a score below 60 maps to "F" and a score of 90 or above maps to "A". Finding the rank of a value in a sorted list: rank = bisect.bisect_left(sorted_list, value). Use bisect when you have a sorted list and frequently need to find positions or insert values while maintaining order. For frequent insertions into large lists, the third-party sortedcontainers.SortedList is more efficient.
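The grade lookup and the left/right distinction can be checked with a short sketch. The grade letters are ordered from the lowest band to the highest ("FDCBA") so that higher scores map to better grades; the breakpoints and sample scores are illustrative.

```python
import bisect


def grade(score, breakpoints=(60, 70, 80, 90), grades="FDCBA"):
    # bisect() counts how many breakpoints are <= score.
    return grades[bisect.bisect(breakpoints, score)]


print([grade(s) for s in (33, 60, 77, 89, 90, 100)])
# ['F', 'D', 'C', 'B', 'A', 'A']

scores = [10, 20, 20, 30]
left = bisect.bisect_left(scores, 20)    # before the existing 20s
right = bisect.bisect_right(scores, 20)  # after the existing 20s
bisect.insort(scores, 25)                # insert while keeping order

print(left, right)  # 1 3
print(scores)       # [10, 20, 20, 25, 30]
```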
TypedDict (Python 3.8+, typing.TypedDict) allows defining dict shapes with type annotations, enabling static type checkers to validate dict access. from typing import TypedDict; class User(TypedDict): name: str; age: int; email: str. Using total=False makes all keys optional: class Config(TypedDict, total=False): debug: bool; port: int. Mix required and optional with inheritance: class RequiredFields(TypedDict): name: str; class OptionalFields(RequiredFields, total=False): age: int. TypedDict provides no runtime validation — it is purely a type hint mechanism for static checkers like mypy and pyright. At runtime, a TypedDict is just a regular dict. Use TypedDict for: API response structures, configuration dicts, function parameters that accept dicts with known shapes. For runtime validation, use Pydantic BaseModel instead. @dataclass is often better than TypedDict for new code since it provides a proper class with attribute access.
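The lack of runtime validation is easy to demonstrate. In this sketch, the sample values are hypothetical, and bad_user is deliberately ill-typed: a static checker such as mypy would flag it, but the code runs without error.

```python
from typing import TypedDict


class User(TypedDict):
    name: str
    age: int


class Config(TypedDict, total=False):
    debug: bool
    port: int


user: User = {"name": "Ada", "age": 36}

# Flagged by a static type checker (age should be int), but fine at runtime.
bad_user: User = {"name": "Bob", "age": "unknown"}

# total=False makes every key optional, so a partial dict is valid.
config: Config = {"debug": True}

print(type(user))                       # <class 'dict'>
print(bad_user["age"])                  # unknown
print(sorted(User.__required_keys__))   # ['age', 'name']
print(sorted(Config.__optional_keys__)) # ['debug', 'port']
```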
PEP 8 is Python's official style guide. Key rules: indentation: 4 spaces per level (spaces are preferred over tabs, and the two must not be mixed). Line length: max 79 characters for code, 72 for docstrings and comments. Naming: lower_snake_case for variables/functions, UpperCamelCase for classes, UPPER_SNAKE_CASE for constants, _single_leading for "protected", __double_leading for name mangling. Imports: separate import statements on separate lines, grouped (stdlib → third-party → local); alphabetical order within groups is a common convention enforced by isort rather than a PEP 8 rule. Whitespace: spaces around operators, after commas, no spaces inside brackets. Blank lines: 2 between top-level definitions, 1 between methods. Tools: flake8 (linting), black (auto-formatter that enforces a strict, consistent style), isort (import sorter), pylint (comprehensive linter). Many modern projects use ruff, an extremely fast linter and formatter written in Rust that can replace flake8, isort, pyupgrade, and black.
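The naming and layout rules can be seen together in a short hypothetical module. All names here (MAX_RETRIES, CircleShape, describe_shape) are invented for illustration.

```python
"""Hypothetical module illustrating PEP 8 naming and layout."""

import math

MAX_RETRIES = 3  # constants use UPPER_SNAKE_CASE


class CircleShape:
    """Class names use UpperCamelCase."""

    def __init__(self, radius):
        self.radius = radius
        self._cache = None       # single underscore: internal use
        self.__token = "secret"  # double underscore: name-mangled

    def area(self):
        return math.pi * self.radius ** 2


def describe_shape(shape):
    """Functions and variables use lower_snake_case."""
    rounded_area = round(shape.area(), 2)
    return f"area={rounded_area}"


print(describe_shape(CircleShape(2.0)))  # area=12.57
```

Name mangling stores the __token attribute as _CircleShape__token, which is why it is not reachable as shape.__token from outside the class.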