A set is an unordered collection of zero or more hashable objects. All the built-in immutable data types, such as float, int, str, and tuple, are hashable and can be added to sets (a tuple is hashable only if every item it contains is hashable). The built-in mutable data types, such as dict, list, and set, are not hashable because their contents can change, and a hash value derived from those contents would change with them. Sets are mutable, so we can easily add or remove items, but since they are unordered they have no notion of index position and so cannot be sliced or strided.
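As a quick illustration of the hashability rule, the sketch below tries to add a few values to a set. The helper `try_add` is a made-up name for this example only. It relies on the fact that adding an unhashable item to a set raises `TypeError`.

```python
def try_add(target, item):
    """Try to add item to the set; return True on success, False if unhashable."""
    try:
        target.add(item)
        return True
    except TypeError:
        # Raised for unhashable items such as list, dict, and set
        return False


s = set()
print(try_add(s, 42))                  # True: int is hashable
print(try_add(s, (1, 2)))              # True: a tuple of hashable items is hashable
print(try_add(s, [1, 2]))              # False: list is mutable
print(try_add(s, {1, 2}))              # False: a set cannot contain another set
print(try_add(s, frozenset({1, 2})))   # True: frozenset is the immutable counterpart of set
print(len(s))                          # 3
```

Only the three hashable values end up in the set; the rejected items leave it unchanged.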
Examples of Sets:
```python
# Empty set ({} would create an empty dict, so use set())
s = set()

# A set can hold items of different types
s = {8, "OK", (3, 4), 0.4}

# The following three sets are equal:
s1 = set("apple")
s2 = set("aple")
s3 = {'a', 'p', 'l', 'e'}
print(s1)
print(s2)
print(s3)

# Convert a list to a set.
# This is also a quick way to get the distinct elements of a list.
numbers = [1, 2, 2, 4, 4, 5]
s = set(numbers)
print(s)
```
```python
import string


def get_distinct_words(s):
    # Characters to strip from both ends of each word
    # (string.punctuation already includes both quote characters)
    special_chars = string.whitespace + string.punctuation + string.digits
    # Split the text into a list of words on whitespace
    word_list = s.split()
    # Clean each word and normalize it to lowercase
    clean_word_list = [word.strip(special_chars).lower() for word in word_list]
    words = set(clean_word_list)
    # A token made only of punctuation or digits strips down to ""
    words.discard("")
    return words


s = """ A set is an unordered collection of zero or more hashable objects.
All the built-in immutable data types, such as float, int, str, and tuple,
are hashable and can be added to sets. The built-in mutable data types,
such as dict, list, and set, are not hashable because their hash value
changes when the items they contain change. Sets are mutable, so we can
easily add or remove items, but since they are unordered they have no notion
of index position and so cannot be sliced or strided."""
print(get_distinct_words(s))
```
Syntax | Description |
---|---|
s.add(x) | Adds item x to set s if it is not already in s |
s.clear() | Removes all the items from set s |
s.copy() | Returns a shallow copy of set s |
s.difference(t), s - t | Returns a new set that has every item that is in set s that is not in set t |
s.difference_update(t), s -= t | Removes every item that is in set t from set s |
s.discard(x) | Removes item x from set s if it is in s; see also set.remove() |
s.intersection(t), s & t | Returns a new set that has each item that is in both set s and set t |
s.intersection_update(t), s &= t | Makes set s contain the intersection of itself and set t |
s.isdisjoint(t) | Returns True if sets s and t have no items in common |
s.issubset(t), s <= t | Returns True if set s is equal to or a subset of set t; use s < t to test whether s is a proper subset of t |
s.issuperset(t), s >= t | Returns True if set s is equal to or a superset of set t; use s > t to test whether s is a proper superset of t |
s.pop() | Removes and returns an arbitrary item from set s, or raises a KeyError exception if s is empty |
s.remove(x) | Removes item x from set s, or raises a KeyError exception if x is not in s; see also set.discard() |
s.symmetric_difference(t), s ^ t | Returns a new set that has every item that is in set s and every item that is in set t, but excluding items that are in both sets |
s.symmetric_difference_update(t), s ^= t | Makes set s contain the symmetric difference of itself and set t |
s.union(t), s \| t | Returns a new set that has all the items in set s and all the items in set t that are not in set s |
s.update(t), s \|= t | Adds every item in set t that is not in set s, to set s |
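To make the table more concrete, here is a small sketch that exercises several of the methods and operators on two example sets. The values are arbitrary, and `sorted()` is used only so that the printed order is predictable.

```python
s = {1, 2, 3, 4}
t = {3, 4, 5}

# Operators (and their method equivalents) return new sets
print(sorted(s | t))   # union: [1, 2, 3, 4, 5]
print(sorted(s & t))   # intersection: [3, 4]
print(sorted(s - t))   # difference: [1, 2]
print(sorted(s ^ t))   # symmetric difference: [1, 2, 5]

# The methods accept any iterable; the operators require sets on both sides
print(sorted(s.union([5, 6])))   # [1, 2, 3, 4, 5, 6]

# Subset and disjointness tests
print({1, 2} <= s)             # True
print(s < s)                   # False: a set is not a proper subset of itself
print(s.isdisjoint({9, 10}))   # True

# discard() is silent when the item is missing; remove() raises KeyError
u = s.copy()
u.discard(99)
try:
    u.remove(99)
except KeyError:
    print("99 is not in the set")

# In-place update modifies u but leaves s, the set it was copied from, untouched
u |= t
print(sorted(u))   # [1, 2, 3, 4, 5]
print(sorted(s))   # [1, 2, 3, 4]
```

As a rule of thumb, prefer `discard()` when a missing item is not an error, and `remove()` when it signals a bug.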
Similar to a list comprehension, a set comprehension is an expression and a loop with an optional condition, all enclosed in braces.
```python
# {expression for item in iterable}
# {expression for item in iterable if condition}
from os import listdir
from os.path import isfile, join


# Get the set of file names (not subdirectories) in a directory
def get_files(dir_path):
    return {f for f in listdir(dir_path) if isfile(join(dir_path, f))}


print(get_files("c:\\Proj\\"))
```
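The directory example depends on the local file system, so here is a self-contained sketch of both comprehension forms. The word list is invented for illustration.

```python
words = ["Apple", "apple", "Banana", "kiwi", "KIWI", "fig"]

# {expression for item in iterable}
# Duplicates produced by the expression collapse into one item
lowercase = {w.lower() for w in words}
print(sorted(lowercase))      # ['apple', 'banana', 'fig', 'kiwi']

# {expression for item in iterable if condition}
long_lengths = {len(w) for w in words if len(w) > 3}
print(sorted(long_lengths))   # [4, 5, 6]
```

Six input words yield only four distinct lowercase words, because the set discards the repeats automatically.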
```python
# Return a list of all possible permutations of n elements taken from data
def permutation(data, n):
    if n < 1 or n > len(data):
        return []
    if len(data) == 1 or n == 1:
        return [[a] for a in data]
    retlist = []
    # Permutations that include data[0]: insert it at every position
    # of each (n-1)-permutation of the remaining elements
    for a in permutation(data[1:], n - 1):
        for i in range(len(a) + 1):
            b = a.copy()
            b.insert(i, data[0])
            retlist.append(b)
    # Permutations that do not include data[0]
    if len(data) > n:
        retlist.extend(permutation(data[1:], n))
    return retlist


def combination(data, n):
    # Convert each permutation to a frozenset and collect them in a set, so
    # permutations that differ only in order collapse into one combination.
    # A set cannot contain a mutable set, so we must use frozenset here.
    return {frozenset(a) for a in permutation(data, n)}


data = list(range(1, 11))
print(permutation(data, 2))
print(combination(data, 2))
```
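For comparison, the standard library's `itertools` module offers `permutations` and `combinations`, which can stand in for the hand-written recursion. This is an alternative sketch, not the approach used above, and it assumes the input items are distinct.

```python
from itertools import combinations, permutations

data = [1, 2, 3, 4]

# All ordered arrangements of 2 items: 4 * 3 = 12
perms = list(permutations(data, 2))

# Unordered selections of 2 items, stored as frozensets so they can live in a set
combos = {frozenset(c) for c in combinations(data, 2)}

print(len(perms))                    # 12
print(len(combos))                   # 6
print(frozenset({2, 1}) in combos)   # True: order does not matter in a frozenset
```

Because `combinations` already yields each selection once, the set here is only a convenient container for fast membership tests.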