Data Compression


Questions 1 - 10
1

Read the passage. A media company publishes articles, photos, and videos and uses different compression methods depending on content. Compression reduces the number of bits needed by encoding patterns more efficiently, helping files download faster and cost less to store. Lossless compression preserves exact data and is favored for text, where accuracy matters, while lossy compression discards some information to achieve smaller files and is common for images and video. Huffman coding and LZW are lossless techniques often used for text or general data, while JPEG is a popular lossy format for still images. MPEG is widely used for video compression, often balancing quality against file size. Selecting the right method depends on whether perfect fidelity or smaller size is the priority.

Based on the text, how does Huffman coding differ from JPEG compression?

Huffman is a video standard; JPEG compresses text files.

Huffman increases size; JPEG always eliminates artifacts entirely.

Huffman discards details; JPEG preserves every original bit.

Huffman is lossless for symbols; JPEG is lossy for images.

Explanation

This question tests the AP Computer Science Principles skill of comparing compression methods, here Huffman coding and JPEG. Different methods suit different content: Huffman coding is a lossless technique for text and other symbol data, while JPEG is a lossy format designed for images. The passage lists Huffman coding among the 'lossless techniques often used for text or general data' and calls JPEG 'a popular lossy format for still images.' Choice D ('Huffman is lossless for symbols; JPEG is lossy for images') is correct because it states exactly this distinction. Choice C is incorrect because it reverses the two: Huffman preserves data (lossless), while JPEG discards details (lossy). To help students: Create comparison tables that group compression methods by lossless versus lossy and list their typical applications. Emphasize that the choice of method depends on both the content type and whether perfect accuracy is required.

2

Read the passage. A cloud platform stores millions of customer documents and wants to reduce storage costs without changing any file contents. Data compression helps by representing repeated patterns more efficiently, lowering the number of bits needed to store data. Lossless compression is used when exact recovery matters, such as for text files, code, and spreadsheets, while lossy compression is common for media where small imperfections are acceptable. LZW is a lossless algorithm that builds a dictionary of repeated sequences and substitutes short codes, which can be effective when the same phrases or patterns appear many times. Huffman coding is also lossless, using shorter bit patterns for more frequent symbols. JPEG and MPEG, by contrast, are typically lossy and focus on shrinking images and video.

Which scenario best demonstrates the use of LZW compression?

Shrinking a photo by discarding subtle visual details.

Reducing video size by predicting changes between frames.

Compressing repetitive server logs without altering any characters.

Improving audio clarity by adding extra data to the stream.

Explanation

This question tests the AP Computer Science Principles skill of identifying appropriate use cases for LZW compression. LZW is a lossless method that builds a dictionary of repeated sequences and replaces them with shorter codes. The passage describes it as 'a lossless algorithm that builds a dictionary of repeated sequences and substitutes short codes, which can be effective when the same phrases or patterns appear many times.' Choice C ('Compressing repetitive server logs without altering any characters') is correct because server logs typically contain repeated patterns (timestamps, IP addresses, error messages) that LZW can compress effectively, and leaving every character unaltered matches the lossless requirement. Choice A is incorrect because it describes lossy compression (discarding visual details), while LZW is explicitly lossless. To help students: Provide examples of repetitive data (logs, source code, structured documents) where LZW works well. Emphasize that the dictionary approach pays off most when the same sequences recur often.
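The dictionary idea described above can be sketched in a few lines of Python. This is a minimal teaching sketch, not a production codec: the function names `lzw_compress` and `lzw_decompress` and the sample log line are hypothetical, and the code assumes every character has a code point below 256 (plain ASCII or Latin-1 text).

```python
def lzw_compress(text: str) -> list[int]:
    """Encode text as a list of dictionary codes (basic LZW sketch)."""
    # Start with one entry per single character (code points 0-255 assumed).
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                 # keep extending the match
        else:
            codes.append(dictionary[current])   # emit code for longest match
            dictionary[candidate] = next_code   # learn the new sequence
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes: list[int]) -> str:
    """Rebuild the original text by regrowing the same dictionary."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code refers to the sequence that is being defined right now.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(out)


# Hypothetical repetitive server log used only for illustration.
log = "GET /index.html 200\n" * 50
codes = lzw_compress(log)
assert lzw_decompress(codes) == log   # lossless: every character restored
print(len(log), "characters ->", len(codes), "codes")
```

Note that each code needs more than 8 bits once the dictionary grows past 256 entries, so the count of codes is not the same as the compressed size in bytes; it only shows how repeated sequences collapse into single entries.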

3

Read the passage. Streaming services rely on compression to deliver content efficiently across networks with varying speeds. Compression reduces the number of bits sent, which can prevent buffering and lower data usage. Lossless compression preserves exact data and is common for text like subtitles, while lossy compression is common for images and video because it can shrink files much more by discarding information viewers are less likely to notice. JPEG is a familiar lossy format for still images, and MPEG is widely used for video, often combining multiple strategies to reduce size. The main challenge is balancing smaller files against visible artifacts or reduced clarity. Services tune compression levels to maintain acceptable quality while keeping playback smooth.

Why is lossy compression preferred in many streaming video applications?

It ensures every frame is restored with perfect accuracy.

It is required for subtitles because text can change safely.

It achieves much smaller files, with tolerable quality loss.

It works by assigning shorter codes to frequent letters only.

Explanation

This question tests the AP Computer Science Principles skill of explaining why lossy compression is preferred for streaming video. Compression reduces the number of bits sent, and lossy compression achieves much greater reduction by removing information viewers are unlikely to notice. The passage says lossy compression 'can shrink files much more by discarding information viewers are less likely to notice,' and it ties fewer bits to less buffering and lower data usage. Choice C ('It achieves much smaller files, with tolerable quality loss') is correct because it captures this trade-off: much smaller files, which keep playback smooth, in exchange for a quality loss viewers can tolerate. Choice A is incorrect because restoring every frame with perfect accuracy describes lossless compression, while the passage clearly states that lossy compression discards information. To help students: Discuss real-world streaming constraints such as limited bandwidth and why perfect quality is not always necessary. Use the quality settings in streaming platforms to illustrate the compression trade-offs.

4

Read the passage. In a streaming-service setting, data compression reduces the number of bits needed to store or transmit information. Compression works by finding patterns and representing them more efficiently, which helps platforms deliver movies, music, and captions quickly over limited bandwidth. Two broad categories appear: lossless compression preserves every original detail so the decompressed data matches exactly, while lossy compression removes some information to achieve smaller files. For text such as subtitles and chat logs, lossless methods are common because even a small change can alter meaning; Huffman coding assigns shorter bit patterns to frequent symbols, and LZW replaces repeated sequences with short codes. For images and video, lossy methods often dominate because the human eye tolerates small changes; JPEG compresses still images by discarding subtle visual details, and MPEG compresses video by combining spatial compression with frame-to-frame prediction. The trade-off is constant: stronger compression usually means smaller files but more noticeable artifacts, so services adjust settings to balance quality and smooth playback.

Based on the text, what trade-offs are involved in using JPEG compression?

It preserves every pixel while greatly shrinking files.

It increases file size to prevent any streaming delays.

It reduces file size but may introduce visible artifacts.

It replaces repeated words with codes for perfect text accuracy.

Explanation

This question tests the AP Computer Science Principles skill of recognizing the trade-offs in lossy compression methods such as JPEG. Lossy compression achieves greater size reduction than lossless compression by permanently discarding some data that users may not notice. The passage says JPEG compresses still images 'by discarding subtle visual details' and notes that 'stronger compression usually means smaller files but more noticeable artifacts.' Choice C ('It reduces file size but may introduce visible artifacts') is correct because it captures both sides of the JPEG trade-off: reduced file size (the benefit) and possible visible artifacts (the cost). Choice A is incorrect because it contradicts the lossy nature of JPEG: a method that discards subtle visual details does not preserve every pixel. To help students: Focus on identifying key terms like 'trade-off' and 'artifacts' in compression contexts. Practice distinguishing the benefits (smaller files) from the costs (quality loss) of different compression methods.
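The size-versus-quality trade-off can be illustrated with a small Python sketch. This is not JPEG's actual algorithm, which is considerably more involved; it only shows the general idea of discarding fine detail. The helper `quantize`, the synthetic `samples` list, and the step sizes are all assumptions made for illustration, and `zlib` is used simply to measure how compressible the result becomes.

```python
import zlib


def quantize(samples: list[int], step: int) -> list[int]:
    """Round each 0-255 sample down to a multiple of `step` (lossy)."""
    return [(s // step) * step for s in samples]


# Hypothetical "image row": a smooth gradient with small variations.
samples = [min(255, i // 4 + (i * 7) % 5) for i in range(1024)]

for step in (1, 8, 32):
    approx = quantize(samples, step)
    # Fewer distinct values usually compress better...
    size = len(zlib.compress(bytes(approx)))
    # ...but the approximation drifts further from the original.
    max_error = max(abs(a - b) for a, b in zip(samples, approx))
    print(f"step={step:2d}  compressed={size:4d} bytes  max_error={max_error}")
```

A larger step throws away more detail, so the maximum error grows; that error is the numeric counterpart of the visible artifacts the passage mentions. The original samples cannot be recovered from the quantized ones, which is what makes the method lossy.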

5

Read the passage. A messaging app compresses text to save bandwidth and storage while keeping messages readable and unchanged. Compression reduces file size by encoding common patterns with fewer bits, and for text this must be lossless so every character is restored exactly. Huffman coding is one lossless technique that assigns shorter bit patterns to characters that appear more often, which can shrink many natural-language messages. LZW is another lossless method that replaces repeated sequences with short codes from a growing dictionary, which can be effective for repetitive logs or structured text. In contrast, JPEG and MPEG are typically lossy and are chosen for images and video where small inaccuracies are acceptable. The app therefore favors lossless algorithms for text to avoid altering meaning.

Based on the text, what is the primary benefit of using Huffman coding?

It increases file size to preserve network reliability.

It assigns shorter codes to frequent symbols to reduce size.

It removes subtle pixels to improve photo realism.

It converts text into MPEG for smoother playback.

Explanation

This question tests the AP Computer Science Principles skill of explaining how Huffman coding works as a lossless compression technique. Huffman coding assigns variable-length codes based on symbol frequency, so common symbols cost fewer bits. The passage describes it as 'one lossless technique that assigns shorter bit patterns to characters that appear more often, which can shrink many natural-language messages.' Choice B ('It assigns shorter codes to frequent symbols to reduce size') is correct because it restates this benefit almost word for word. Choice C is incorrect because it describes lossy behavior (removing pixels), while the passage explicitly identifies Huffman coding as lossless. To help students: Use frequency analysis exercises to demonstrate how Huffman coding works. Show examples of common letters (like 'e' in English) getting shorter codes than rare letters (like 'q').
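A frequency analysis exercise of this kind might look like the following Python sketch. It is one simple way to build a Huffman code using the standard library's `heapq` and `collections.Counter`; the function name `huffman_codes` and the sample message are hypothetical, and a real compressor would also need to store the code table alongside the bits.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict[str, str]:
    """Build a prefix code where frequent characters get shorter bit strings."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct character still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {char: code_so_far}).
    heap = [(w, i, {ch: ""}) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie_breaker = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, tie_breaker, merged))
        tie_breaker += 1
    return heap[0][2]


# Hypothetical message chosen because 'e' is clearly the most frequent symbol.
message = "see the bees flee the trees"
codes = huffman_codes(message)
encoded = "".join(codes[ch] for ch in message)
for ch, code in sorted(codes.items(), key=lambda item: len(item[1])):
    print(repr(ch), code)
print(len(message) * 8, "bits as 8-bit characters ->", len(encoded), "bits")
```

Because no code is a prefix of another, the bit string can be decoded unambiguously, which is why the method stays lossless even though the codes have different lengths.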

6

Read the passage. A school archives student essays and wants to reduce file sizes while preserving exact wording. Data compression helps by finding patterns and encoding them with fewer bits, which saves storage and speeds transfers. For text, lossless compression is essential because changing even one character can alter meaning or grading. Huffman coding is a lossless method that compresses by giving frequent characters shorter codes, and LZW is another lossless method that replaces repeated sequences with dictionary codes. JPEG and MPEG are usually lossy and focus on media like images and video, where small changes may be acceptable. The archive therefore chooses lossless methods to maintain integrity.

What is the primary benefit of using lossless compression for archived essays?

It ensures the restored text matches the original exactly.

It improves image sharpness by discarding extra pixels.

It always produces the smallest possible file size.

It converts documents into MPEG for easier editing.

Explanation

This question tests the AP Computer Science Principles skill of explaining why lossless compression is essential for text archival. Compression reduces file sizes, but when archiving text documents such as essays, exact accuracy matters because even small changes can alter meaning or affect grading. The passage states this directly: 'For text, lossless compression is essential because changing even one character can alter meaning or grading.' Choice A ('It ensures the restored text matches the original exactly') is correct because it identifies the primary benefit, which is what keeps the archived essays intact. Choice C is incorrect because lossless compression does not always produce the smallest files: lossy compression typically achieves greater reduction, but at the cost of data loss. To help students: Use examples of how a single character change can alter meaning (e.g., 'not' vs 'now'). Emphasize that academic integrity requires perfect preservation of student work.

7

Based on the text, why is lossy compression preferred in many digital photos?

It reduces file size by discarding minor details people may not notice.

It converts photos into text so they compress like documents.

It avoids trade-offs by improving quality as size decreases.

It keeps every pixel identical, ensuring perfect restoration.

Explanation

This question tests the AP Computer Science Principles skill of explaining why lossy compression is often preferred for digital photos despite the quality loss. Digital photos contain large amounts of data, much of which represents subtle variations in color and detail that human eyes cannot easily distinguish. Lossy compression exploits these perceptual limits by discarding the less noticeable details, achieving file size reductions that make photo storage and sharing practical. Choice A ('It reduces file size by discarding minor details people may not notice') is correct because it gives exactly this reason: photos become more manageable without a significant drop in perceived quality. Choice C is incorrect because it describes an impossible scenario: lossy compression always involves a trade-off, and it cannot improve quality as size decreases. To help students: Demonstrate with high-resolution photos how much space would be needed without compression, then show how lossy compression makes digital photography practical. Discuss how the acceptable compression level depends on the photo's intended use.
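The storage demonstration suggested above comes down to simple arithmetic, sketched here in Python. The resolution (4000 by 3000 pixels), the 24 bits per pixel, and the 10:1 compression ratio are assumed figures chosen for illustration; actual ratios vary widely with the image and the quality setting.

```python
def uncompressed_bytes(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Size of a raw image: one fixed-size color value per pixel."""
    return width * height * bits_per_pixel // 8


# Hypothetical 12-megapixel photo with 8 bits each for red, green, and blue.
raw = uncompressed_bytes(4000, 3000)

# Assumed ratio for illustration only; real results depend on the photo.
ratio = 10

print(f"uncompressed: {raw / 1_000_000:.0f} MB")
print(f"at {ratio}:1 lossy compression: {raw / ratio / 1_000_000:.1f} MB")
```

Under these assumptions a single photo drops from 36 MB to 3.6 MB, which is the kind of reduction that makes storing and sharing thousands of photos practical.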

8

Based on the text, how does lossless compression differ from lossy compression?

Lossless preserves all information; lossy may remove some detail.

Lossless always makes files larger; lossy always makes them smaller.

Lossless discards details; lossy preserves every original bit.

Lossless works only for video; lossy works only for text.

Explanation

This question tests the AP Computer Science Principles skill of differentiating between lossless and lossy compression. Lossless compression reduces file size by removing redundancy without losing any original information, while lossy compression achieves greater size reduction by permanently discarding some data deemed less important. The fundamental distinction is that lossless compression allows perfect reconstruction of the original data, whereas lossy compression yields an approximation of the original. Choice A ('Lossless preserves all information; lossy may remove some detail') is correct because it states this difference accurately. Choice C is incorrect because it reverses the definitions: lossless preserves details while lossy discards them, not the other way around. To help students: Create visual comparisons showing how the same file looks after lossless versus lossy compression. Emphasize that the choice between methods depends on whether perfect accuracy or smaller file size matters more for the specific use case.

9

Read the passage. During backup storage, organizations compress data to reduce the amount of disk space required and to speed up copying large archives. Compression works by encoding information more efficiently, often by finding repeated patterns across files. In this context, lossless compression is preferred because backups must restore data exactly, including program files, spreadsheets, and databases. Huffman coding supports lossless compression by assigning shorter codes to common symbols, while LZW builds a dictionary of repeated sequences so the same patterns can be stored once and referenced many times. Lossy compression, used in formats like JPEG for images and MPEG for video, can shrink files further by discarding subtle details, but that risk is unacceptable for most backups. As a result, backup systems typically prioritize data integrity over maximum size reduction.

Why is lossless compression preferred in backup storage?

It intentionally removes details to maximize shrinkage.

It guarantees exact restoration of the original data.

It works only for photos, not for documents.

It converts files into MPEG to speed up recovery.

Explanation

This question tests the AP Computer Science Principles skill of explaining why lossless compression is critical for applications such as backup storage. The choice between lossless compression (preserving all data) and lossy compression (discarding some data) depends on the application's requirements for data integrity. The passage says lossless compression is preferred for backups because 'backups must restore data exactly, including program files, spreadsheets, and databases.' Choice B ('It guarantees exact restoration of the original data') is correct because it identifies this requirement, which the passage gives as the reason lossless compression is preferred. Choice A is incorrect because it describes the behavior of lossy compression (removing details), a risk the passage calls 'unacceptable for most backups.' To help students: Use real-world scenarios to illustrate when data integrity is critical and when some loss is acceptable. Emphasize that backups serve as insurance policies: they must restore data perfectly when needed.
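The "restore exactly" requirement can be checked directly, as in this Python sketch. It uses the standard library's `zlib` module, whose DEFLATE format combines LZ77-style matching with Huffman coding (LZ77 rather than the LZW named in the passage, but likewise dictionary-based and lossless). The function name `backup_round_trip` and the sample data are hypothetical.

```python
import hashlib
import zlib


def backup_round_trip(data: bytes) -> tuple[bytes, bool]:
    """Compress data, then confirm decompression restores it byte for byte."""
    compressed = zlib.compress(data, 9)   # 9 = strongest standard level
    restored = zlib.decompress(compressed)
    # Comparing digests mirrors how a backup tool might verify integrity.
    same = hashlib.sha256(restored).digest() == hashlib.sha256(data).digest()
    return compressed, same


# Hypothetical spreadsheet-like data with many repeated rows.
original = b"id,name,status\n" + b"1001,example,active\n" * 2000
compressed, intact = backup_round_trip(original)
print(len(original), "bytes ->", len(compressed), "bytes; intact:", intact)
```

The integrity check passes for any input because the method is lossless; how much space is saved depends on how repetitive the data is.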

10

Read the passage. In a text-focused workflow, data compression helps reduce storage and speed up transfers without changing the meaning of documents. Compression works by representing information with fewer bits, often by exploiting repetition or predictable patterns. Lossless compression is essential for text because decompressed output must match the original exactly, while lossy compression intentionally discards some information and is better suited to media where small changes are acceptable. Huffman coding is a common lossless approach that uses shorter bit patterns for more frequent characters, shrinking many text files efficiently. LZW is another lossless method that builds a dictionary of repeated sequences and replaces those sequences with short codes, which can work well for logs and repetitive data. By contrast, JPEG and MPEG are widely used for images and video, where some quality loss is tolerated to achieve much smaller files.

Based on the text, how does lossless compression differ from lossy compression?

Lossless exactly preserves data; lossy discards some information.

Lossless applies only to video; lossy applies only to text.

Lossless removes details; lossy preserves every original bit.

Lossless always makes files larger; lossy always makes them smaller.

Explanation

This question tests the AP Computer Science Principles skill of differentiating between lossless and lossy compression. Lossless compression preserves all original data so it can be reconstructed perfectly, while lossy compression permanently removes some information to achieve smaller files. The passage states that 'Lossless compression is essential for text because decompressed output must match the original exactly, while lossy compression intentionally discards some information.' Choice A ('Lossless exactly preserves data; lossy discards some information') is correct because it reflects this distinction as directly stated in the passage. Choice C is incorrect because it reverses the definitions: lossless preserves details rather than removing them. To help students: Create comparison charts showing lossless versus lossy characteristics and their typical applications. Emphasize that 'lossless' means 'no loss' of data, which makes it essential for text where accuracy matters.
