2.2 notes

  • It describes data compression and how it reduces the number of bits needed to represent data, saving transmission time and storage space.
  • It distinguishes between lossy and lossless data compression.
  • It highlights the Image Lab Project’s data concepts: compressing images and analyzing their size.
  • It explains image metadata and what to consider when managing image files, including file type, size, height, width, number of pixels, visual perception, displaying images in Python Jupyter notebooks, and using Python libraries and concepts.
  • It asks questions related to accessing files in terminal, working with different operating systems, and observations, struggles, or learnings while working with the code.
  • It examines why the file path matters when working with images and how the metadata source and label relate to Unit 5 topics.
  • It describes IPython and why it is interesting in Jupyter Notebooks for both Pandas and images.
  • It outlines different ways to read and encode images, including using PIL, base64, and numpy.
  • It explains the purpose and results of a program that creates metadata and manipulates images.
  • It asks questions about the Grey Scale algorithm, scaling images, and how these topics relate to data compression.
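
The difference between lossless and lossy compression can be sketched with the standard library alone. This is a minimal illustration, not the Image Lab code: the single-colour 64x64 image is made up, and averaging the three channels is just one simple grey scale formula.

```python
import zlib

# Hypothetical 64x64 image in which every pixel has the same RGB colour.
pixels = [(200, 100, 50)] * (64 * 64)
raw = bytes(channel for pixel in pixels for channel in pixel)

# Lossless: decompressing returns exactly the original bytes.
packed = zlib.compress(raw)
restored = zlib.decompress(packed)

# Lossy: averaging the three channels keeps one byte per pixel,
# and the original colour cannot be recovered from that byte.
gray = bytes(sum(pixel) // 3 for pixel in pixels)

print("raw bytes:", len(raw))
print("zlib bytes:", len(packed))
print("grey scale bytes:", len(gray))
print("round trip is exact:", restored == raw)
```

The grey scale version is a third of the raw size, but only the zlib version can be turned back into the original image.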

2.3 notes

Data Cleaning:

  • The dataset contains missing data points and invalid data.
  • One data point is inaccurate, where “Junior” is entered instead of a number.
  • It’s crucial to clean the data before performing any analysis, as garbage in will result in garbage out.

Extracting Info:

  • We used Pandas’ DataFrame to extract information from the dataset.
  • We extracted a single column, then two columns, and printed each without the index.
  • This allows us to focus on the specific information we need for our analysis.
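
The cleaning and extraction steps can be sketched together as follows. The column names and values are assumptions made up for illustration, and coercing bad entries with pd.to_numeric before dropping incomplete rows is one possible cleaning approach, not necessarily the one used in the lesson.

```python
import pandas as pd

# Hypothetical records: one invalid entry ("Junior") and one missing GPA.
df = pd.DataFrame({
    "Student ID": ["123", "246", "578", "469"],
    "Year in School": [12, 10, "Junior", 11],
    "GPA": [3.57, 4.00, 2.78, None],
})

# Cleaning: turn non-numeric entries into NaN, then drop incomplete rows.
df["Year in School"] = pd.to_numeric(df["Year in School"], errors="coerce")
clean = df.dropna()

# Extracting: one column, then two columns, printed without the index.
print(clean[["GPA"]].to_string(index=False))
print(clean[["Student ID", "GPA"]].to_string(index=False))
```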

DataFrame Sort:

  • We used sort_values() to sort the DataFrame by the GPA column.
  • We sorted the DataFrame in both ascending and descending order.
  • Sorting helps to make sense of the data and allows us to draw meaningful insights.
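
A short sketch of both sort orders, using made-up student records. sort_values() sorts ascending by default, and ascending=False reverses the order.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "Student ID": ["123", "246", "578"],
    "GPA": [3.57, 4.00, 2.78],
})

# Ascending is the default; pass ascending=False for highest first.
ascending = df.sort_values(by="GPA")
descending = df.sort_values(by="GPA", ascending=False)

print(ascending.to_string(index=False))
print(descending.to_string(index=False))
```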

DataFrame Selection or Filter:

  • We used conditional statements to select data from the DataFrame.
  • We filtered out every row where GPA is less than or equal to 3.00, keeping only rows with a GPA above 3.00.
  • Filtering helps us to select the specific data we need for our analysis.
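
The filter can be sketched with a boolean condition inside the square brackets. The records below are made up for illustration.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "Student ID": ["123", "246", "578"],
    "GPA": [3.57, 4.00, 2.78],
})

# The comparison produces a True/False value per row;
# indexing with it keeps only the rows marked True.
above_three = df[df["GPA"] > 3.00]

print(above_three.to_string(index=False))
```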

DataFrame Selection Max and Min:

  • We used conditional statements to select the rows with the maximum and minimum GPA in the DataFrame.
  • The output showed the data of the students with the highest and lowest GPA.
  • Selecting the maximum and minimum values helps us to identify the highest and lowest achievers.
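
One way to sketch this selection is to compare the GPA column against its own max() and min(). The records are again made up for illustration.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "Student ID": ["123", "246", "578"],
    "GPA": [3.57, 4.00, 2.78],
})

# Keep the rows whose GPA equals the column maximum or minimum.
# If several students tie, all of them are returned.
top = df[df["GPA"] == df["GPA"].max()]
bottom = df[df["GPA"] == df["GPA"].min()]

print(top.to_string(index=False))
print(bottom.to_string(index=False))
```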

Creating Own DataFrame:

  • We created our own DataFrame using a Python dictionary.
  • We used pd.DataFrame(dict) to convert the dictionary to a DataFrame.
  • Creating our own DataFrame allows us to work with data that we are interested in.
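
A minimal sketch of building a DataFrame from a dictionary. The keys become column names and each list becomes a column; the names and values here are invented for illustration.

```python
import pandas as pd

# Hypothetical dictionary: each key is a column, each list holds its values.
data = {
    "Name": ["Ava", "Ben", "Cam"],
    "Age": [16, 17, 15],
}

own_df = pd.DataFrame(data)

print(own_df.to_string(index=False))
```

All the lists need the same length, otherwise pd.DataFrame raises a ValueError.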