More Data Types: Dictionaries and Sets

Dictionaries

What is a Dictionary?

A collection of key:value pairs.
A dictionary is like a list, but instead of using numbers (indexes) to access values, it uses unique keys associated with values.
Unlike list items, dictionary items have no numeric position: you look a value up by its key, not by an index.
Dictionaries are useful for storing data where quick lookup by a specific key is needed, like a phonebook or a student record.

Dictionary Structure

Dictionary Syntax and Main Characteristics

Creating a dictionary:

                dictionary = {
                    'key1': value1,
                    'key2': value2,
                    'keyN': valueN
                }
            
Dictionaries have no positional index
Dictionaries preserve insertion order in Python 3.6+ (officially guaranteed by the language from 3.7), so iterating over a dictionary yields the items in the order they were added.
OrderedDict from the collections module is still useful when you need its extras, such as move_to_end() or order-sensitive equality comparison.
Keys must be immutable data types
Commonly, keys are strings, but they can also be numbers or tuples (as long as they contain only immutable elements).
Keys must be unique
If duplicate keys are defined, only the last occurrence is kept, overwriting any previous value.
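
The key rules above can be tried out with a minimal sketch. The coordinates and city names are made-up sample data, and the exact wording of the error message may differ between Python versions:

```python
# Tuples of immutable values are hashable, so they can serve as keys.
# (The coordinates and city names are hypothetical sample data.)
city_by_coords = {
    (42.70, 23.32): 'Sofia',
    (42.14, 24.75): 'Plovdiv',
}
print(city_by_coords[(42.70, 23.32)])
# Sofia

# A list is mutable and therefore unhashable: using it as a key fails.
try:
    bad_dict = {[42.70, 23.32]: 'Sofia'}
except TypeError as error:
    error_message = str(error)
    print(error_message)
```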

Dictionaries - examples


            # Example 1: Basic dictionary usage
            en_bg_dict = {
                'apple': 'ябълка',
                'orange': 'портокал',
                'banana': 'банан'
            }

            print(en_bg_dict['apple'])
            # ябълка

            # Example 2: Duplicate keys overwrite previous values
            en_bg_dict = {
                'apple': 'ябълка',
                'orange':'портокал',
                'banana':'банан',
                'apple':'манго',
            }

            print(en_bg_dict['apple'])
            # манго
        

Retrieve item from dictionary

Dictionary values can be retrieved using their respective key in square brackets:

                item = dictionary_name[key]
            
Examples:

                prices = {
                    "apples": 2.50,
                    "oranges": 2.43,
                    "bananas": 3.50
                }

                apples_price = prices['apples']
                print(f'{apples_price:.2f}')
                # 2.50

                oranges_price = prices['oranges']
                print(f'{oranges_price:.2f}')
                # 2.43
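
Square-bracket lookup assumes the key exists. The sketch below shows what happens when it does not, and two common ways to handle that case: the in operator and the dict.get() method. The key 'plums' is just an example of a missing key.

```python
prices = {"apples": 2.50, "oranges": 2.43, "bananas": 3.50}

# Square brackets raise KeyError for a missing key.
try:
    plums_price = prices["plums"]
except KeyError as error:
    print(f"No such key: {error}")
# No such key: 'plums'

# The 'in' operator tests whether a key exists.
print("plums" in prices)
# False

# dict.get() returns None (or a chosen default) instead of raising.
print(prices.get("plums"))
# None
print(prices.get("plums", 0.0))
# 0.0
```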
            

Change item in a dictionary

To update the value associated with an existing key in a dictionary, assign a new value using the key.

                    dictionary_name[key] = new_value
            
Example: Update the price of apples

                prices = {
                    'apples': 2.50,
                    'oranges': 2.43,
                    'bananas': 3.50
                }

                prices['apples'] = 2.20  # Change value for key 'apples'
                print(prices)

                # Output:
                # {'apples': 2.2, 'oranges': 2.43, 'bananas': 3.5}
            

Add item in a dictionary

Unlike lists, dictionaries allow us to add new items by specifying a key that does not already exist.
This can be done by assigning a value to the new key using:

                dictionary_name[new_key] = new_value
            
Example:

                prices = {
                    "apples": 2.50,
                    "oranges": 2.43,
                    "bananas": 3.50
                }

                # Add new key-value pair:
                prices['plums'] = 4.30
                print(prices)
                # {'apples': 2.5, 'oranges': 2.43, 'bananas': 3.5, 'plums': 4.3}
            

Delete item from dictionary: del statement

The del statement removes the key-value pair associated with the given key from the dictionary.

                del dictionary[key]
            
If the specified key does not exist, a KeyError will be raised.
Example:

                prices = {
                    "apples": 2.50,
                    "oranges": 2.43,
                    "bananas": 3.50
                }

                # Delete '"oranges": 2.43' pair:
                del prices['oranges']
                print(prices)
                # {'apples': 2.5, 'bananas': 3.5}

                # Try to delete a non-existing key (uncomment to see the error):
                # del prices['plums']
                # KeyError: 'plums'
            

Delete item from dictionary: pop()

Unlike del, pop() removes the key-value pair and also returns the value associated with the given key.

                value = dictionary.pop(key[, default])
            
If the key is found in the dictionary, it is removed and its value is returned. If the key is not found, the default value is returned (if provided).
If the key is not found and no default is provided, a KeyError will be raised.

                prices = {
                    "apples": 2.50,
                    "oranges": 2.43,
                    "bananas": 3.50
                }

                # Remove 'apples' key-value pair and return its value
                apples_price = prices.pop('apples')
                print(apples_price)
                # Output: 2.5
                print(prices)
                # Output: {'oranges': 2.43, 'bananas': 3.5}

                # Try to remove 'plums' key-value pair from the dictionary. No KeyError raised, as default is provided
                plums_price = prices.pop('plums', None)
                print(plums_price)
                # Output: None
                print(prices)
                # Output: {'oranges': 2.43, 'bananas': 3.5}
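
For completeness, here is a short sketch of the remaining case: calling pop() for a missing key without a default. The default value 0.0 in the second call is an arbitrary choice for illustration.

```python
prices = {"oranges": 2.43, "bananas": 3.50}

# Without a default, pop() raises KeyError for a missing key.
try:
    prices.pop("plums")
except KeyError as error:
    missing_key = error.args[0]
    print(f"KeyError: {missing_key!r}")
# KeyError: 'plums'

# With a default, pop() works as "remove if present".
removed = prices.pop("plums", 0.0)
print(removed)
# 0.0
print(prices)
# {'oranges': 2.43, 'bananas': 3.5}
```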
            

Get all dictionary keys: dict.keys()

The dict.keys() method returns a dict_keys view object of the dictionary's keys.
The keys in the dict_keys object can be used in loops or converted into a list if needed.
Example: Get dictionary keys as list

                prices = {
                    "apples": 2.50,
                    "oranges": 2.43,
                    "bananas": 3.50
                }

                # Get all the keys in the dictionary as dict_keys view
                keys = prices.keys()

                # Convert dict_keys view to a list to display all the keys
                print(list(keys))
                # Output: ['apples', 'oranges', 'bananas']
            

The dict_keys view object is dynamic!

The view object is dynamic, meaning any changes to the dictionary (like adding or removing keys) will automatically reflect in the dict_keys view.

                prices = {
                    "apples": 2.50,
                    "oranges": 2.43,
                    "bananas": 3.50
                }

                # Get all the keys in the dictionary
                keys = prices.keys()
                print(keys)
                # Output: dict_keys(['apples', 'oranges', 'bananas'])

                # Modify the dictionary by adding a new key-value pair
                prices["plums"] = 4.30

                # The dict_keys view reflects the change automatically
                print(keys)
                # Output: dict_keys(['apples', 'oranges', 'bananas', 'plums'])

                # Remove an item
                del prices["apples"]

                # The dict_keys view reflects the deletion as well
                print(keys)
                # Output: dict_keys(['oranges', 'bananas', 'plums'])
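
One practical consequence of views being dynamic: adding or removing keys while looping over a view makes the loop fail. A minimal sketch of the problem and of a common workaround (looping over a list copy of the keys) could look like this:

```python
prices = {"apples": 2.50, "oranges": 2.43, "bananas": 3.50}

try:
    for fruit in prices.keys():
        del prices[fruit]  # changes the dictionary size mid-iteration
except RuntimeError as error:
    error_message = str(error)
    print(error_message)
# dictionary changed size during iteration

# Safe variant: iterate over a snapshot (a list copy of the keys).
prices = {"apples": 2.50, "oranges": 2.43, "bananas": 3.50}
for fruit in list(prices.keys()):
    if prices[fruit] > 3:
        del prices[fruit]
print(prices)
# {'apples': 2.5, 'oranges': 2.43}
```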
            

Get all dictionary values: dict.values()

The dict.values() method of a dictionary returns a dict_values view object of the dictionary's values.
The values in the dict_values object can be used in loops or converted into a list if needed.
Example: Get dictionary values as list

                prices = {
                    "apples": 2.50,
                    "oranges": 2.43,
                    "bananas": 3.50
                }

                # Get all the values in the dictionary as dict_values view
                values = prices.values()

                # Convert dict_values view to a list to display all the values
                print(list(values))
                # [2.5, 2.43, 3.5]
            

The dict_values view object is dynamic!


            prices = {
                "apples": 2.50,
                "oranges": 2.43,
                "bananas": 3.50
            }

            values = prices.values()

            # Print initial values
            print("Initial values:", list(values))
            # Initial values: [2.5, 2.43, 3.5]

            # Modify the dictionary by adding a new item
            prices["plums"] = 4.30

            # Print values after modification
            print("Updated values:", list(values))
            # Updated values: [2.5, 2.43, 3.5, 4.3]
        

Get all dictionary items (key-value pairs): dict.items()

The dict.items() method of a dictionary returns a dict_items view object of the dictionary's items, i.e. a view of (key, value) pairs.

            prices = {
                "apples": 2.50,
                "oranges": 2.43,
                "bananas": 3.50
            }

            prices_items = prices.items()
            print(prices_items)

            # dict_items([('apples', 2.5), ('oranges', 2.43), ('bananas', 3.5)])
        

items() - dynamic view example


            prices = {
                    "apples": 2.50,
                    "oranges": 2.43,
                    "bananas": 3.50
            }

            # get prices items view:
            prices_items = prices.items()
            print("before:", prices_items)

            # remove an item:
            del prices["oranges"]

            # check if prices_items reflects the change:
            print("after:", prices_items)
        

            before: dict_items([('apples', 2.5), ('oranges', 2.43), ('bananas', 3.5)])
            after: dict_items([('apples', 2.5), ('bananas', 3.5)])
        

Iterate over dictionary keys:

Note that in a for-in loop, iterating over dict_name is the same as iterating over dict_name.keys()


            # Variant 1:
            for key in dict_name:
                # do something with a key

            # Variant 2:
            for key in dict_name.keys():
                # do something with a key
        

            prices = {
                "apples": 2.50,
                "oranges": 2.43,
                "bananas": 3.50
            }

            for key in prices:
                print(key, end=",")
            # apples,oranges,bananas,

            for key in prices.keys():
                print(key, end=",")
            # apples,oranges,bananas,
        

Iterate over dictionary values:


            for value in dict_name.values():
                # do something with a value
        

            prices = {
                    "apples": 2.50,
                    "oranges": 2.43,
                    "bananas": 3.50
            }

            for v in prices.values():
                print(v)
        

            2.5
            2.43
            3.5
        

Iterate over keys and values

This is the most commonly used and the most Pythonic construct for looping over dictionaries.

            for key, value in dict_name.items():
                # do something with a key
                # do something with a value
        

            prices = {
                "apples": 2.50,
                "oranges": 2.43,
                "bananas": 3.50
            }

            for fruit, price in prices.items():
                print(f'{fruit}={price}')

            # apples=2.5
            # oranges=2.43
            # bananas=3.5
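
As a further sketch of the same construct, the items view can feed an accumulating loop or be passed to sorted(). The names total and items_by_price are introduced here only for illustration.

```python
prices = {"apples": 2.50, "oranges": 2.43, "bananas": 3.50}

# Unpack each (key, value) pair while accumulating a total.
total = 0
for fruit, price in prices.items():
    total += price
print(f'total={total:.2f}')
# total=8.43

# sorted() accepts the items view; the key function picks the price
# (the second element of each pair) as the sort criterion.
items_by_price = sorted(prices.items(), key=lambda item: item[1])
for fruit, price in items_by_price:
    print(f'{fruit:<8} - {price:.2f}')
# oranges  - 2.43
# apples   - 2.50
# bananas  - 3.50
```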
        

More on dictionaries

dict class @python docs
Dictionary view objects @python docs

Useful Readings

Use Cases of Python Dictionaries Data Structure

Sets

What is a Set?

An unordered collection of unique, immutable (hashable) objects.
Note that the set itself is a mutable object.
The set data type in Python implements sets as they are defined in mathematics.
Sets are commonly used to test membership, remove duplicates from a sequence, and compute mathematical operations such as intersection, union, difference, and symmetric difference.

Syntax


                set_name = {value1, value2, valueN}
            
Example: set elements must be immutable:

                set_of_immutables = {"a", 1, "b", 2, 3}
                print(set_of_immutables)  # e.g. {'a', 'b', 2, 3, 1} (order may vary)

                # Lists are mutable, so the next line raises an error:
                set_of_mutables = { [1,2,3], ["a", "b"] }  # TypeError: unhashable type: 'list'
            
There is no order in a set! We cannot say whether value1 will come before value2.
No duplicate items are allowed. If we define a set with duplicate items, duplicates will be ignored!

                int_dup_numbers = {1, 2, 3, 1, 2, 4, 3, 5, 1, 2, 3}

                print(int_dup_numbers)
                # {1, 2, 3, 4, 5}
            

                int_numbers = {1, 2, 3, 4, 5}
                int_dup_numbers = {1, 2, 3, 1, 2, 4, 3, 5, 1, 2, 3}

                print(int_numbers == int_dup_numbers)
                # True
            

Set use case: get unique values from a list using set()


            # from l1 create l2 with unique values:
            l1 = [1,2,1,3,2,5,3]
            l2 = list(set(l1))
            print(l2)

            # [1, 2, 3, 5] (the order of the result is not guaranteed)
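
Converting to a set discards the original order of the elements. When the order of first appearance matters, one common alternative is dict.fromkeys(), which relies on dictionary keys being unique and kept in insertion order. A minimal sketch, where the words list is made-up sample data:

```python
l1 = [1, 2, 1, 3, 2, 5, 3]

# dict.fromkeys() keeps the first occurrence of each element, in order.
unique_in_order = list(dict.fromkeys(l1))
print(unique_in_order)
# [1, 2, 3, 5]

words = ["pear", "apple", "pear", "fig", "apple"]
unique_words = list(dict.fromkeys(words))
print(unique_words)
# ['pear', 'apple', 'fig']

# If any consistent order will do, sort the set instead.
sorted_words = sorted(set(words))
print(sorted_words)
# ['apple', 'fig', 'pear']
```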
        

Sets vs Dictionaries

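Similarities:
Both are mutable (elements can be added and removed).
Both use curly braces for their literals.
Both require hashable items: dictionary keys and set elements must be immutable.
Differences:
A dictionary is a collection of key:value pairs.
A set is just a collection of immutable objects, with no associated values.
A set cannot contain duplicate items, and a dictionary cannot contain duplicate keys, but different dictionary keys may hold equal values.
A set is unordered, while a dictionary preserves insertion order (Python 3.7+).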
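
Because both literals use curly braces, empty braces could mean either type. Python treats {} as an empty dictionary, so an empty set has to be created with set(). A short sketch:

```python
empty_braces = {}
print(type(empty_braces))
# <class 'dict'>

empty_set = set()
print(type(empty_set))
# <class 'set'>

# Start from an empty set and add elements one by one.
fruits = set()
fruits.add('apple')
fruits.add('apple')  # duplicate, ignored
print(fruits)
# {'apple'}
```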

Sets vs Lists

Sets are implemented internally as hash tables, so testing whether an element is present is much faster in a set than in a list, which has to be scanned item by item.
Lists are somewhat faster to iterate over, and they support access by index, which sets do not.

Example: sets are faster for searching elements


            import time

            # Creating a large set and list for demonstration
            large_set = set(range(10_000_000))
            large_list = list(range(10_000_000))

            # Searching for an element in the set
            start_time = time.time()
            exists_in_set = 999999 in large_set     # Searching for a high number
            set_duration = time.time() - start_time

            # Searching for the same element in the list
            start_time = time.time()
            exists_in_list = 999999 in large_list  # Searching for the same high number
            list_duration = time.time() - start_time

            print(f"Time taken for search in set: {set_duration:.8f} seconds")
            print(f"Time taken for search in list: {list_duration:.8f} seconds")

            # Time taken for search in set: 0.00000381 seconds
            # Time taken for search in list: 0.01182604 seconds
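
A single measurement with time.time() can be noisy. As an alternative sketch, the standard timeit module repeats the lookup many times. The collection size (100,000) and the repeat count (100) are arbitrary choices, and the actual timings depend on the machine, so no sample output is shown.

```python
import timeit

numbers_list = list(range(100_000))
numbers_set = set(numbers_list)
target = 99_999  # last element: the worst case for the list scan

# Run each membership test 100 times and measure the total time.
list_time = timeit.timeit(lambda: target in numbers_list, number=100)
set_time = timeit.timeit(lambda: target in numbers_set, number=100)

print(f"list: {list_time:.6f} seconds for 100 lookups")
print(f"set:  {set_time:.6f} seconds for 100 lookups")
```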

        

Example: lists are faster for iteration


            import time

            # Creating a large set and list for demonstration
            large_set = set(range(10_000_000))
            large_list = list(range(10_000_000))

            # Iterating through a list and a set
            start_time = time.time()
            for item in large_list:
                pass  # Simple iteration
            list_iteration_duration = time.time() - start_time

            start_time = time.time()
            for item in large_set:
                pass  # Simple iteration
            set_iteration_duration = time.time() - start_time

            print(f"Time taken for iterating list: {list_iteration_duration:.8f} seconds")
            print(f"Time taken for iterating set: {set_iteration_duration:.8f} seconds")

            # Time taken for iterating list: 0.26020551 seconds
            # Time taken for iterating set: 0.29316807 seconds


        

Basic sets operations

Sets Union

[Figure: set_union.png]

Returns a new set containing the elements that are in either set.

Sets Union - example

Pipe operator | or method union can be used


            set1 = {1, 2, 3, 4}
            set2 = {5, 4}

            union1 = set1 | set2
            union2 = set1.union(set2)
            print(union1)
            print(union2)

            # {1, 2, 3, 4, 5}
            # {1, 2, 3, 4, 5}
        

Sets Intersection

[Figure: set_intersection.png]

Returns a new set containing the elements that belong to both sets.

Sets Intersection - example

Ampersand operator & or method intersection can be used


            set1 = {1, 2, 3, 4}
            set2 = {5, 4}

            intersec1 = set1 & set2
            intersec2 = set1.intersection(set2)
            print(intersec1)
            print(intersec2)

            # {4}
            # {4}
        

Sets Difference

[Figure: set_difference.png]

C = B - A, where C is a new set containing the elements of B that are not present in A.

Sets Difference - example

Operator - or method difference can be used


            set1 = {1, 2, 3, 4, 5}
            set2 = {5, 4}

            dif1 = set1.difference(set2)
            dif2 = set1 - set2
            print(dif1)
            print(dif2)

            # {1, 2, 3}
            # {1, 2, 3}
        

Sets Symmetric Difference

C = A △ B, where C is a new set containing the elements that are in either A or B, but not in both.

Sets Symmetric Difference - example

Operator ^ or method symmetric_difference can be used


            set1 = {1, 2, 3, 4}
            set2 = {5, 4}

            sym_dif1 = set1.symmetric_difference(set2)
            sym_dif2 = set1 ^ set2
            print(sym_dif1)
            print(sym_dif2)

            # {1, 2, 3, 5}
            # {1, 2, 3, 5}
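
The four operations are related to each other. As a small sketch, the symmetric difference can be rebuilt from union, intersection and difference. Note also that, unlike the other three, difference depends on the order of the operands.

```python
set1 = {1, 2, 3, 4}
set2 = {5, 4}

sym_dif = set1 ^ set2

# Everything in the union, except what both sets share.
print(sym_dif == (set1 | set2) - (set1 & set2))
# True

# Equivalently: the two one-sided differences, combined.
print(sym_dif == (set1 - set2) | (set2 - set1))
# True

# Difference is not symmetric: the operand order matters.
print(set1 - set2)
# {1, 2, 3}
print(set2 - set1)
# {5}
```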
        

More on sets

Python Sets Methods: A Practical Guide with Examples

Exercises

Task: best students scores

the task

[Figure: student_scores table (./Tasks_and_HW/images/task1_student_with_high_score.png)]
Write your solution in a file named task_best_students_scores.py
Represent the information given in student_scores table in appropriate data structure.
From student_scores data, create a new data structure named best_students_scores, storing the information (name and score) only for students with scores greater than 4.00
Print out the names and scores from best_students_scores as shown:

                Ivan     - 5.00
                Maria    - 5.50
                Georgy   - 5.00
            

Task: min max student score

the task

Write your solution in a file named task_min_max_student_score.py
Using the same student_scores data as in the previous task:

Print out the name and score of the student with maximum score
Print out the name and score of the student with minimum score

            Maria - 5.5
            Alex - 3.5
        

hints

To find the minimum or maximum among a collection of values, you can use Python's built-in functions min() and max()
You can use them on any iterable, like this:

                numbers = [2.50, 2.43, 3.50]

                # Get max value in list:
                print( max(numbers) )
            
Or on dictionary, like this:

                prices = {
                    "apples": 2.50,
                    "oranges": 2.43,
                    "bananas": 3.50
                }

                # Get max value in a dict
                print( max(prices.values()) )

                # Get the key of max value in a dict
                print( max(prices, key=prices.get) )
            

Task: simple word index

the task

Write your solution in a file named task_simple_word_index.py
Make a program which counts how many times each word appears in a given text

            text = """apple and banana one apple one banana
                      a red apple and a green apple"""
        

            # not necessarily in the same order
            apple  - 4
            and    - 2
            banana - 2
            one    - 2
            a      - 2
            red    - 1
            green  - 1
        

Hints

Hint 1: for simplicity, we treat "words" as the space-separated sequences in a string. To get the list of words, we can use the str.split() method, as shown:

                text = "some words delimited by spaces"
                words_list = text.split()

                print(words_list)

                # ['some', 'words', 'delimited', 'by', 'spaces']
            
Hint 2: you can use the list.count(x) method, which returns the number of times x appears in the list.

                l = [1,1,1,2,2,2,2,3,3]

                print(l.count(1)) # 3
                print(l.count(2)) # 4
                print(l.count(3)) # 2
            

Task: job candidate skills

the task

Write your solution in a file named job_candidate_skills.py
Given:
A job requires the following skills: 'Python', 'Django', 'SQL', 'Git'
A candidate has the following skills: 'Python', 'Flask', 'Git', 'JavaScript'
Your task:
Find and print the skills the candidate matches ('Python', 'Git').
Find and print the skills the candidate is missing ('SQL', 'Django').
Find and print the extra skills the candidate has ('Flask', 'JavaScript').
Expected output:

                Matched Skills: Python, Git
                Missing Skills: SQL, Django
                Extra Skills: Flask, JavaScript
            

Hints

Use sets, as this data structure is the most appropriate one for this task.
To stringify a sequence or set, you can use the str.join() method. It joins elements of an iterable (like a list or set) into a single string, with a specified separator between them:

                skills = {'Python', 'Git'}
                print(f'Skills: {", ".join(skills)}')
            

These slides are based on a customised version of Hakimel's reveal.js framework.