2.2 notes

Section 2.2 describes data compression: reducing the number of bits needed to represent data, which saves transmission time and storage space.
It distinguishes between lossy compression, which discards some information, and lossless compression, which lets the original data be restored exactly.
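
As a small illustration of the lossless case, the sketch below uses Python's standard zlib module. The sample data is made up for the example. It shows that the compressed form is smaller and that decompressing gives back exactly the original bytes.

```python
import zlib

# Hypothetical sample data: repetitive content compresses well.
original = b"red red red red blue blue blue blue " * 50

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

A lossy method would not pass the final check, because some of the original information is thrown away to save more space.
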
The message highlights the Image Lab Project’s data concepts: compression and analysis of file size.
It explains image metadata and the considerations when managing image files: file type, file size, height, width, number of pixels, visual perception, displaying images in Python Jupyter notebooks, and the Python libraries involved.
It asks questions about accessing files in the terminal, working with different operating systems, and observations, struggles, or learnings while working with the code.
The message examines why the path is a big deal when working with images and how the metadata source and label relate to Unit 5 topics.
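
One reason the path matters is that different operating systems write paths differently. The sketch below uses the standard pathlib module with hypothetical folder and file names. It is only an illustration, not the project's own code.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Hypothetical folder and file names, for illustration only.
image_path = Path("images") / "sample.png"

print(image_path.name)      # file name without the folder
print(image_path.suffix)    # file extension, including the dot
print(image_path.exists())  # True only if the file is really on disk

# The same relative path, written the way each operating system expects it.
posix_style = str(PurePosixPath("images") / "sample.png")
windows_style = str(PureWindowsPath("images") / "sample.png")
print(posix_style)
print(windows_style)
```

Building paths with `/` on Path objects instead of hard-coding separators keeps the same code working on macOS, Linux, and Windows.
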
It describes IPython and why it is interesting in Jupyter Notebooks for both Pandas and images.
The message outlines different ways to read and encode images, including using PIL, base64, and numpy.
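
The base64 step can be sketched on its own with the standard library. The bytes below are a made-up stand-in for the contents of an image file. PIL and numpy are left out so the example runs without extra installs.

```python
import base64

# Hypothetical raw bytes standing in for the contents of an image file.
image_bytes = bytes([255, 0, 0, 0, 255, 0, 0, 0, 255])

# Encode the bytes as ASCII text, then decode them back.
encoded = base64.b64encode(image_bytes).decode("ascii")
decoded = base64.b64decode(encoded)

print(encoded)
print(len(image_bytes), "bytes ->", len(encoded), "base64 characters")
print("round trip ok:", decoded == image_bytes)
```

Base64 makes the data larger, not smaller. It is a text encoding that lets binary image data travel inside text formats such as HTML, not a compression method.
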
It explains the purpose and results of a program that creates metadata and manipulates images.
It asks questions about the grayscale algorithm, scaling images, and how these topics relate to data compression.
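
The notes do not say which grayscale formula is used. The sketch below assumes the simple approach of averaging the red, green, and blue values of each pixel, on a made-up 2x2 image stored as plain Python lists.

```python
# Hypothetical 2x2 image as rows of (red, green, blue) tuples.
pixels = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (90, 120, 150)],
]

def to_gray(image):
    """Replace each RGB pixel with the integer average of its three channels."""
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in image]

gray = to_gray(pixels)
print(gray)  # [[85, 85], [85, 120]]
```

Each pixel goes from three numbers to one, so the grayscale version needs about a third of the data. The color information cannot be recovered afterward, which is what makes this a lossy change.
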

2.3 notes

The dataset contains missing data points and invalid data.
One data point is invalid: “Junior” is entered where a number is expected.
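
One way to catch problems like these is sketched below, assuming pandas is installed. The rows and the "student" column are made up. Only the GPA column name and the "Junior" entry come from the notes, and placing "Junior" in the GPA column is an assumption.

```python
import pandas as pd

# Hypothetical rows; "Junior" stands in for the invalid entry.
df = pd.DataFrame({
    "student": ["A", "B", "C", "D"],
    "GPA": [3.5, "Junior", None, 2.8],
})

# Non-numeric entries such as "Junior" become NaN instead of raising an error.
df["GPA"] = pd.to_numeric(df["GPA"], errors="coerce")

# Drop rows whose GPA is missing or was invalid.
clean = df.dropna(subset=["GPA"])
print(clean)
```

Dropping rows is only one option. Depending on the analysis, it may be better to correct the value by hand or to keep the row and ignore just that column.
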
It’s crucial to clean the data before performing any analysis, as garbage in will result in garbage out.

Extracting Info:
We used Pandas’ DataFrame to extract information from the dataset.
We extracted a single column, then two columns, and printed them without the index.
This allows us to focus on the specific information we need for our analysis.
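
A minimal sketch of this kind of extraction, assuming pandas is installed. The rows and every column name except GPA are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "student": ["A", "B", "C"],
    "grade": [10, 11, 12],
    "GPA": [3.5, 2.8, 3.9],
})

# Double brackets return a DataFrame with only the listed columns.
one_column = df[["GPA"]]
two_columns = df[["student", "GPA"]]

# to_string(index=False) leaves the row index out of the printed output.
print(one_column.to_string(index=False))
print(two_columns.to_string(index=False))
```
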

We used sort_values() to sort the DataFrame by the GPA column.
We sorted the DataFrame in both ascending and descending order.
Sorting helps to make sense of the data and allows us to draw meaningful insights.
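
The two sort orders might look like this, assuming pandas is installed. The rows are made up for the example.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "student": ["A", "B", "C"],
    "GPA": [3.5, 2.8, 3.9],
})

# sort_values returns a new DataFrame; the original stays unchanged.
ascending = df.sort_values(by="GPA")
descending = df.sort_values(by="GPA", ascending=False)

print(ascending)
print(descending)
```
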

We used conditional statements to select the maximum and minimum GPA in the DataFrame.
The output showed the rows for the students with the highest and lowest GPAs.
Selecting the maximum and minimum values helps us to identify the highest and lowest achievers.
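
One way to write such a selection is with a condition inside the brackets, sketched here with made-up rows and assuming pandas is installed.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "student": ["A", "B", "C"],
    "GPA": [3.5, 2.8, 3.9],
})

# The condition builds a True/False mask; only rows marked True are kept.
highest = df[df["GPA"] == df["GPA"].max()]
lowest = df[df["GPA"] == df["GPA"].min()]

print(highest)
print(lowest)
```

If several students share the top or bottom GPA, this selection returns all of them.
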

We created our own DataFrame using a Python dictionary.
We used pd.DataFrame(dict) to convert the dictionary to a DataFrame.
Creating our own DataFrame allows us to work with data that we are interested in.
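
A minimal sketch of building a DataFrame from a dictionary, assuming pandas is installed. The keys and values are hypothetical. Each key becomes a column name and each list becomes that column's values.

```python
import pandas as pd

# Hypothetical data: each key is a column, each list holds that column's values.
data = {
    "name": ["Ava", "Ben", "Cy"],
    "age": [15, 16, 17],
}

df = pd.DataFrame(data)
print(df)
```

The lists must all have the same length, or pandas raises an error.
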