Project Overview
This project is a lightweight BitTorrent client written in C++23, designed specifically to download single-file torrents from YTS.mx. It implements core BitTorrent protocol features, including torrent file parsing, tracker communication, peer-to-peer file downloading, and piece verification. The client supports essential commands (info, peers, download_piece, download) to interact with torrents, making it a practical tool for understanding and demonstrating BitTorrent technology.
The project was developed as part of a learning exercise to master network programming, file handling, and protocol implementation in C++. It’s optimized for YTS.mx torrents, which typically use single-file structures and HTTP/HTTPS trackers, ensuring reliability and simplicity. This documentation explains every component, from code structure to algorithms, and reflects the concepts learned during development.
Project Structure
The codebase consists of a single source file, main.cpp, organized into logical sections with clear comments, alongside configuration files for building and running the project:
- main.cpp: The core implementation, containing all functionality for bencoding, tracker communication, peer interactions, and command-line interface.
- CMakeLists.txt: CMake configuration for building the project with dependencies (CURL, OpenSSL, nlohmann/json).
- your_program.sh: Shell script for local compilation and execution, using vcpkg for dependency management.
- lib/nlohmann/json.hpp: External JSON library for parsing bencoded data.
The project uses a single-file approach for simplicity, with main.cpp divided into five commented sections:
- Includes and Dependencies: External libraries and standard C++ headers.
- Bencoding and Decoding Functions: Parsing and encoding BitTorrent’s bencode format.
- Tracker Utilities: Functions for communicating with HTTP/HTTPS trackers.
- Peer Communication: Logic for downloading pieces from peers.
- Main Command Logic: Command-line interface for user interaction.
Detailed Code Explanation
1. Includes and Dependencies
This section imports necessary libraries and defines the json alias for nlohmann::json.
- Headers:
  - nlohmann/json.hpp: Parses bencoded data into JSON for easy manipulation.
  - <curl/curl.h>: Handles HTTP requests to trackers.
  - <openssl/sha.h>: Computes SHA-1 hashes for info hash and piece verification.
  - <arpa/inet.h>, <netinet/in.h>, <sys/socket.h>, <fcntl.h>, <sys/select.h>: Network programming for peer connections.
  - <fstream>, <iostream>, <sstream>: File and console I/O.
  - <random>: Generates random peer IDs.
  - <string>, <vector>: Standard C++ utilities.
  - <unistd.h>: POSIX functions for socket operations.
- Purpose: Provides the foundation for network communication, cryptographic hashing, JSON parsing, and file handling, tailored to BitTorrent’s requirements.
2. Bencoding and Decoding Functions
BitTorrent uses bencode, a simple encoding format for strings, integers, lists, and dictionaries. This section implements parsing and encoding logic.
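To make the four bencode types concrete, here is a minimal sketch that builds each one from already-encoded parts. The helper names (bencode_string, bencode_int, bencode_list, bencode_dict) are hypothetical and are not taken from main.cpp, which encodes nlohmann::json values instead.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helpers for illustration only; main.cpp works on json values.

// Byte string: <length>:<contents>
std::string bencode_string(const std::string &s) {
    return std::to_string(s.size()) + ":" + s;
}

// Integer: i<number>e
std::string bencode_int(long long n) {
    return "i" + std::to_string(n) + "e";
}

// List: l<elements>e, where every element is already bencoded.
std::string bencode_list(const std::vector<std::string> &encoded_items) {
    std::string out = "l";
    for (const auto &item : encoded_items) out += item;
    return out + "e";
}

// Dictionary: d<key-value pairs>e. Keys are raw strings, values are already
// bencoded. std::map iterates in sorted key order, which bencode requires.
std::string bencode_dict(const std::map<std::string, std::string> &encoded_values) {
    std::string out = "d";
    for (const auto &[key, value] : encoded_values) {
        out += bencode_string(key) + value;
    }
    return out + "e";
}
```

Because the dictionary helper always emits keys in sorted order, encoding the same info dictionary twice yields identical bytes, which is what makes the SHA-1 info hash reproducible.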
- decode_bencoded_value(const std::string &encoded_value, size_t &pos):
  - Purpose: Recursively decodes a bencoded string, updating the position (pos) in the input string.
  - Logic:
    - Handles four bencode types:
      - Strings: <length>:<contents> (e.g., 4:spam → "spam").
      - Integers: i<number>e (e.g., i42e → 42).
      - Lists: l<elements>e (e.g., li1ei2ee → [1, 2]).
      - Dictionaries: d<key-value pairs>e (e.g., d3:foo3:bare → {"foo": "bar"}).
    - Uses std::stoll for number parsing and throws exceptions for invalid formats.
    - Returns a nlohmann::json object representing the decoded value.
  - Key Features:
    - Robust error handling for malformed bencode (e.g., missing e, invalid lengths).
    - Recursive parsing for nested structures.
    - Position tracking to process the entire string.
- decode_bencoded_value(const std::string &encoded_value):
  - Purpose: Wrapper function to decode an entire bencoded string.
  - Logic: Calls the recursive function starting at pos = 0 and verifies that the entire string is consumed.
  - Use Case: Parses .torrent files and tracker responses.
- bencode(const json &value):
  - Purpose: Encodes a JSON value into bencode format.
  - Logic:
    - Converts JSON types (strings, integers, arrays, objects) to bencode.
    - Sorts dictionary keys for consistent encoding (required for info hash).
    - Uses recursion for nested structures.
  - Use Case: Generates bencoded info dictionary for SHA-1 hashing.
- Why It Matters: Bencoding is the backbone of BitTorrent’s data format, used in .torrent files and tracker communication. These functions enable the client to read and process torrent metadata and responses.
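The recursive decoding with position tracking described above can be sketched as follows. This is an illustration under stated assumptions, not the code from main.cpp: the function name decode_to_text is hypothetical, and it renders the result as JSON-like text instead of building a nlohmann::json object so that the example needs only the standard library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical sketch: renders the decoded value as JSON-like text.
// String contents are copied as-is (no escaping), unlike a real JSON library.
std::string decode_to_text(const std::string &in, std::size_t &pos) {
    if (pos >= in.size()) throw std::runtime_error("Unexpected end of input");
    const char c = in[pos];

    if (c >= '0' && c <= '9') {  // string: <length>:<contents>
        const std::size_t colon = in.find(':', pos);
        if (colon == std::string::npos) throw std::runtime_error("Missing ':'");
        const std::size_t len = std::stoull(in.substr(pos, colon - pos));
        if (len > in.size() - (colon + 1)) throw std::runtime_error("Invalid string length");
        pos = colon + 1 + len;
        return "\"" + in.substr(colon + 1, len) + "\"";
    }
    if (c == 'i') {  // integer: i<number>e
        const std::size_t end = in.find('e', pos);
        if (end == std::string::npos) throw std::runtime_error("Missing 'e'");
        const long long n = std::stoll(in.substr(pos + 1, end - pos - 1));
        pos = end + 1;
        return std::to_string(n);
    }
    if (c == 'l' || c == 'd') {  // list or dictionary, both terminated by 'e'
        const bool is_dict = (c == 'd');
        std::string out = is_dict ? "{" : "[";
        ++pos;
        bool first = true;
        while (pos < in.size() && in[pos] != 'e') {
            if (!first) out += ",";
            first = false;
            if (is_dict) {
                if (in[pos] < '0' || in[pos] > '9')
                    throw std::runtime_error("Dictionary key must be a string");
                out += decode_to_text(in, pos) + ":";  // key
            }
            out += decode_to_text(in, pos);  // list element or dictionary value
        }
        if (pos >= in.size()) throw std::runtime_error("Missing 'e'");
        ++pos;  // consume the closing 'e'
        return out + (is_dict ? "}" : "]");
    }
    throw std::runtime_error("Invalid bencode type");
}

// Wrapper: decode from position 0 and require that all input is consumed.
std::string decode_to_text(const std::string &in) {
    std::size_t pos = 0;
    std::string out = decode_to_text(in, pos);
    if (pos != in.size()) throw std::runtime_error("Trailing data after value");
    return out;
}
```

Passing pos by reference lets each recursive call report how much input it consumed, so the caller can continue with the next element without re-scanning.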
3. Tracker Utilities
This section handles communication with BitTorrent trackers to obtain peer lists.
- url_encode_info_hash(const unsigned char *hash, size_t length):
  - Purpose: Converts a 20-byte SHA-1 hash (e.g., info hash) into a URL-encoded string.
  - Logic: Formats each byte as a %XX hexadecimal value (e.g., byte 0x1A → %1A).
  - Use Case: Required for tracker HTTP requests (e.g., ?info_hash=<encoded_hash>).
- write_callback(void *contents, size_t size, size_t nmemb, void *userp):
  - Purpose: CURL callback to capture HTTP response data.
  - Logic: Appends response bytes to a std::string buffer.
  - Use Case: Collects tracker responses containing peer lists.
- select_tracker_url(const json &torrent):
  - Purpose: Selects an HTTP/HTTPS tracker URL from the torrent’s announce or announce-list.
  - Logic:
    - Prefers announce-list (an array of arrays, e.g., [["http://tracker1"], ["http://tracker2"]]), common in YTS.mx torrents.
    - Falls back to announce (a single string, e.g., "http://tracker").
    - Validates URLs start with http:// or https://.
    - Throws an error if no valid tracker is found.
  - Why It Matters: YTS.mx torrents often use multiple trackers in announce-list, and this function ensures robust tracker selection.
- Key Features:
  - Uses CURL for HTTP requests, ensuring compatibility with YTS.mx’s HTTP/HTTPS trackers.
  - Handles compact peer lists (&compact=1) for efficiency.
  - Robust error handling for failed requests or invalid responses.
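A minimal sketch of the info hash encoding and of how a tracker request URL might be assembled is shown below. The url_encode_info_hash signature follows the description above; build_announce_url is a hypothetical helper, and the uploaded, downloaded, and left parameters are standard tracker parameters that this document does not confirm the client sends.

```cpp
#include <cstddef>
#include <string>

// Percent-encodes every byte as %XX, as described for url_encode_info_hash.
std::string url_encode_info_hash(const unsigned char *hash, std::size_t length) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(length * 3);
    for (std::size_t i = 0; i < length; ++i) {
        out += '%';
        out += hex[hash[i] >> 4];    // high nibble
        out += hex[hash[i] & 0x0F];  // low nibble
    }
    return out;
}

// Hypothetical helper (not from main.cpp). Assumes peer_id contains only
// URL-safe characters; uploaded/downloaded/left are assumed parameters.
std::string build_announce_url(const std::string &tracker_url,
                               const std::string &encoded_info_hash,
                               const std::string &peer_id,
                               long long left) {
    // Use '&' if the tracker URL already carries a query string.
    const char sep = tracker_url.find('?') == std::string::npos ? '?' : '&';
    return tracker_url + sep + "info_hash=" + encoded_info_hash +
           "&peer_id=" + peer_id + "&port=6881&uploaded=0&downloaded=0" +
           "&left=" + std::to_string(left) + "&compact=1";
}
```

Encoding every byte, including ones that are already URL-safe, is slightly longer than necessary but keeps the function trivial and is accepted by HTTP trackers.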
4. Peer Communication
This section implements peer-to-peer communication to download torrent pieces.
- exchange_peer_messages(const std::string &saved_path, const std::string &info_hash, const std::pair<std::string, uint16_t> &peer, int piece_index, int piece_length, const std::string &pieces):
  - Purpose: Connects to a peer, performs a handshake, and downloads a specified piece.
  - Logic:
    - Socket Setup:
      - Creates a TCP socket (AF_INET, SOCK_STREAM).
      - Sets non-blocking mode and timeouts (10s for I/O, 5s for connection) using fcntl and setsockopt.
      - Connects to the peer’s IP and port, handling EINPROGRESS for asynchronous connections.
    - Handshake:
      - Sends a 68-byte BitTorrent handshake: a length byte (19) followed by the protocol string BitTorrent protocol, 8 reserved bytes, the 20-byte info hash, and a random 20-byte peer ID.
      - Receives and validates the peer’s handshake response.
    - Protocol Messages:
      - Receives a bitfield message (indicating available pieces).
      - Sends an interested message (ID 2) to express interest.
      - Waits for an unchoke message (ID 1) from the peer.
      - Requests piece blocks (16KB each) using request messages (ID 6).
      - Receives piece messages (ID 7) and assembles the piece.
    - Piece Verification:
      - Computes the SHA-1 hash of the downloaded piece.
      - Compares it with the expected hash from pieces (20-byte segments).
      - Writes the verified piece to saved_path.
  - Key Features:
    - Non-blocking sockets with select for robust connection handling.
    - Random peer ID generation using <random>.
    - Error handling for connection failures, invalid responses, or hash mismatches.
    - Efficient block-based downloading (16KB chunks).
- Why It Matters: Peer communication is the core of BitTorrent’s decentralized file sharing. This function implements the protocol’s handshake and message exchange, enabling file downloads from peers.
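The wire layout of the handshake and of a request message can be sketched like this. The function names (build_handshake, append_u32, build_request) are hypothetical and chosen for illustration; only the byte layout is taken from the BitTorrent peer protocol.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// 68-byte handshake: 1 length byte (19) + "BitTorrent protocol" (19 bytes)
// + 8 reserved zero bytes + 20-byte info hash + 20-byte peer ID.
std::string build_handshake(const std::string &info_hash, const std::string &peer_id) {
    if (info_hash.size() != 20 || peer_id.size() != 20)
        throw std::invalid_argument("info hash and peer ID must be 20 bytes each");
    std::string msg;
    msg += static_cast<char>(19);
    msg += "BitTorrent protocol";
    msg.append(8, '\0');
    msg += info_hash;
    msg += peer_id;
    return msg;
}

// Appends a 32-bit integer in network (big-endian) byte order.
void append_u32(std::string &out, std::uint32_t v) {
    out += static_cast<char>((v >> 24) & 0xFF);
    out += static_cast<char>((v >> 16) & 0xFF);
    out += static_cast<char>((v >> 8) & 0xFF);
    out += static_cast<char>(v & 0xFF);
}

// request message: <length prefix = 13><id = 6><index><begin><length>
std::string build_request(std::uint32_t index, std::uint32_t begin, std::uint32_t length) {
    std::string msg;
    append_u32(msg, 13);          // payload size: 1 ID byte + three 4-byte fields
    msg += static_cast<char>(6);  // message ID for request
    append_u32(msg, index);
    append_u32(msg, begin);
    append_u32(msg, length);
    return msg;
}
```

Every message after the handshake uses the same framing (a 4-byte big-endian length prefix followed by a 1-byte ID), so a reader loop can first read four bytes, then read exactly that many more.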
5. Main Command Logic
This section defines the command-line interface and orchestrates the client’s functionality.
- main(int argc, char *argv[]):
  - Purpose: Parses command-line arguments and executes one of four commands: info, peers, download_piece, download.
  - Logic:
    - Enables unbuffered I/O for immediate console output.
    - Validates the command and arguments.
    - Reads and decodes the .torrent file into a JSON object.
    - Executes the specified command.
- Commands:
  - info <torrent_file>:
    - Purpose: Displays torrent metadata.
    - Output: Tracker URL, file length, info hash, piece length, and piece hashes.
    - Logic:
      - Validates info fields (length, piece length, pieces).
      - Computes the info hash (SHA-1 of bencoded info).
      - Formats piece hashes as hexadecimal strings.
    - Use Case: Debugging and inspecting torrent files.
  - peers <torrent_file>:
    - Purpose: Lists peers available for downloading.
    - Output: IP:port pairs (e.g., 192.168.1.1:6881).
    - Logic:
      - Constructs a tracker request with URL-encoded info hash, peer ID, and parameters (port=6881, compact=1).
      - Sends the request via CURL and decodes the bencoded response.
      - Parses the compact peer list (6-byte entries: 4-byte IP, 2-byte port).
      - Validates peers as a string to prevent JSON type errors.
    - Use Case: Identifying available peers.
  - download_piece -o <save_path> <torrent_file> <piece_index>:
    - Purpose: Downloads a single piece and saves it to save_path.
    - Logic:
      - Fetches peers from the tracker.
      - Computes the piece length (adjusts for the last piece if smaller).
      - Tries peers sequentially until the piece is downloaded.
      - Calls exchange_peer_messages to download and verify the piece.
    - Use Case: Testing piece downloading or partial downloads.
  - download -o <output_file> <torrent_file>:
    - Purpose: Downloads the entire file and saves it to output_file.
    - Logic:
      - Fetches peers from the tracker.
      - Iterates over all pieces, downloading each via exchange_peer_messages.
      - Saves pieces to temporary files (/tmp/piece_<index>), then assembles them into complete_file.
      - Writes the final file after all pieces are verified.
      - Limits peer attempts to 3 per piece for efficiency.
    - Use Case: Primary function for downloading YTS.mx torrents.
- Key Features:
  - Robust argument validation and error handling.
  - Efficient piece assembly using a single buffer (complete_file).
  - Progress logging for user feedback.
  - YTS.mx-specific optimizations (e.g., announce-list support).
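The compact peer list that the peers command parses can be decoded as in the sketch below. The function name parse_compact_peers is hypothetical; the 6-byte layout (4-byte IPv4 address, 2-byte big-endian port) is the one described for the peers command.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: turns the raw "peers" byte string from a tracker
// response into "ip:port" strings. Each entry is 6 bytes.
std::vector<std::string> parse_compact_peers(const std::string &peers) {
    if (peers.size() % 6 != 0)
        throw std::runtime_error("Invalid compact peer list length");

    std::vector<std::string> result;
    for (std::size_t i = 0; i < peers.size(); i += 6) {
        // Read byte k of the current entry as an unsigned value (0..255).
        auto byte = [&](std::size_t k) {
            return static_cast<unsigned>(static_cast<unsigned char>(peers[i + k]));
        };
        const unsigned port = (byte(4) << 8) | byte(5);  // big-endian port
        result.push_back(std::to_string(byte(0)) + "." + std::to_string(byte(1)) + "." +
                         std::to_string(byte(2)) + "." + std::to_string(byte(3)) + ":" +
                         std::to_string(port));
    }
    return result;
}
```

Casting through unsigned char matters here: on platforms where char is signed, bytes above 127 would otherwise turn into negative numbers.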
Implementation Details
YTS.mx Optimization
- Single-File Torrents: The client assumes info["length"] exists, aligning with YTS.mx’s single-file torrents. Multi-file torrents are unsupported, throwing “Missing or invalid ‘length’ field”.
- Tracker Support: select_tracker_url prioritizes announce-list, common in YTS.mx torrents, ensuring reliable tracker connections.
- HTTP/HTTPS Only: Validates tracker URLs to prevent “Unsupported protocol” errors, as YTS.mx uses HTTP/HTTPS trackers.
- JSON Type Safety: Checks peers is a string to avoid [json.exception.type_error.302] errors, handling YTS.mx’s compact peer lists.
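The tracker selection rule (prefer announce-list, fall back to announce, accept only HTTP/HTTPS) can be sketched as follows. This is a simplified illustration, not the real select_tracker_url: the name pick_http_tracker and its signature are hypothetical, and the two torrent fields are passed as plain containers instead of a json object.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// True if the URL starts with http:// or https://.
bool is_http_url(const std::string &url) {
    return url.rfind("http://", 0) == 0 || url.rfind("https://", 0) == 0;
}

// Hypothetical stand-in for select_tracker_url. announce_list mirrors the
// array-of-arrays layout; announce is the single fallback URL (may be empty).
std::string pick_http_tracker(const std::vector<std::vector<std::string>> &announce_list,
                              const std::string &announce) {
    for (const auto &tier : announce_list) {
        for (const auto &url : tier) {
            if (is_http_url(url)) return url;  // first usable tracker wins
        }
    }
    if (is_http_url(announce)) return announce;
    throw std::runtime_error("No valid HTTP/HTTPS tracker found");
}
```

Skipping udp:// entries up front avoids handing CURL a scheme it cannot fetch, which is where an “Unsupported protocol” error would otherwise surface.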
Error Handling
- Throws std::runtime_error for invalid inputs, network failures, or protocol violations.
- Provides descriptive error messages (e.g., “Invalid piece index”, “CURL request failed”).
- Validates torrent fields and tracker responses to prevent crashes.
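As an illustration of this validation style, the sketch below computes the length of a given piece and rejects out-of-range indices with the “Invalid piece index” message quoted above. The helper name piece_length_at and the “Invalid torrent lengths” message are hypothetical; only the last-piece adjustment and the index error come from this document.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: returns the size of piece `index`. Every piece has
// `piece_length` bytes except possibly the last one, which holds the remainder.
std::int64_t piece_length_at(std::int64_t index, std::int64_t file_length,
                             std::int64_t piece_length) {
    if (file_length <= 0 || piece_length <= 0)
        throw std::runtime_error("Invalid torrent lengths");

    // Round up so a partial final piece is counted.
    const std::int64_t num_pieces = (file_length + piece_length - 1) / piece_length;
    if (index < 0 || index >= num_pieces)
        throw std::runtime_error("Invalid piece index");

    const std::int64_t remaining = file_length - index * piece_length;
    return remaining < piece_length ? remaining : piece_length;
}
```

For example, a 100-byte file with 32-byte pieces has four pieces, and the last one is 4 bytes long.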
Efficiency
- Uses compact peer lists (&compact=1) for smaller tracker responses.
- Downloads pieces in 16KB blocks to minimize network overhead.
- Reuses CURL and socket resources efficiently.
- Minimizes memory usage with in-place buffer operations.
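To show how a piece breaks down into 16KB block requests, here is a small sketch. The names kBlockSize and split_into_blocks are hypothetical; the block size of 16 * 1024 bytes matches the figure given above.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

constexpr std::uint32_t kBlockSize = 16 * 1024;  // 16384 bytes per request

// Hypothetical helper: returns (begin offset, length) for every block request
// needed to fetch one piece. Only the final block may be shorter.
std::vector<std::pair<std::uint32_t, std::uint32_t>> split_into_blocks(std::uint32_t piece_length) {
    std::vector<std::pair<std::uint32_t, std::uint32_t>> blocks;
    for (std::uint32_t begin = 0; begin < piece_length; begin += kBlockSize) {
        const std::uint32_t remaining = piece_length - begin;
        const std::uint32_t length = remaining < kBlockSize ? remaining : kBlockSize;
        blocks.emplace_back(begin, length);
    }
    return blocks;
}
```

Each pair maps directly onto the begin and length fields of one request message for the piece being downloaded.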
Build and Run Instructions
Prerequisites
- C++23 Compiler: GCC or Clang supporting C++23.
- Dependencies:
  - libcurl: For HTTP requests.
  - openssl: For SHA-1 hashing.
  - nlohmann/json: JSON parsing (included in lib/nlohmann/json.hpp).
- vcpkg: Dependency manager (set VCPKG_ROOT environment variable).
- CMake: Build system.
Installation
- Clone the repository:

      git clone <repository_url>
      cd codecrafters-bittorrent

- Install dependencies via vcpkg:

      export VCPKG_ROOT=/path/to/vcpkg
      $VCPKG_ROOT/vcpkg install curl openssl

- Build the project:

      ./your_program.sh

  This runs cmake and builds the executable in build/bittorrent.
Usage
Run commands using ./your_program.sh <command> [args]:
- Display torrent info:

      ./your_program.sh info sample.torrent

- List peers:

      ./your_program.sh peers sample.torrent

- Download a piece:

      ./your_program.sh download_piece -o piece.bin sample.torrent 0

- Download entire file:

      ./your_program.sh download -o movie.mp4 sample.torrent
Testing
- Use a YTS.mx single-file torrent (e.g., a movie file).
- Verify download saves the file correctly.
- Check info output matches torrent metadata.
- Debug issues by adding std::cout << torrent.dump(2) << std::endl; to inspect JSON structures.
Concepts Learned
This project was a deep dive into several technical domains, reinforcing key programming and networking concepts:
- BitTorrent Protocol:
  - Understood bencode format for encoding torrent data.
  - Learned tracker communication (HTTP GET requests with parameters like info_hash, peer_id).
  - Mastered peer-to-peer protocol, including handshakes, bitfields, interested/unchoke messages, and piece requests.
  - Grasped piece hashing and verification using SHA-1.
- Network Programming:
  - Implemented TCP sockets for peer connections using <sys/socket.h>.
  - Used non-blocking I/O with fcntl and select for robust connections.
  - Handled timeouts and errors in network operations.
  - Utilized CURL for HTTP requests, managing callbacks and responses.
- C++23 Programming:
  - Leveraged modern C++ features (e.g., std::string, std::vector, range-based loops).
  - Used nlohmann::json for flexible data parsing.
  - Applied exception handling for robust error management.
  - Optimized memory usage with move semantics and in-place operations.
- Cryptography:
  - Computed SHA-1 hashes for info hash and piece verification.
  - Encoded binary data (hashes) for URL parameters.
- File Handling:
  - Read binary .torrent files using std::ifstream.
  - Wrote downloaded pieces and files using std::ofstream.
  - Managed temporary files for piece assembly.
- Build Systems:
  - Configured CMake for cross-platform builds.
  - Integrated vcpkg for dependency management.
  - Wrote shell scripts for local execution.
- Error Handling and Debugging:
  - Implemented comprehensive error checks for inputs, network, and protocol.
  - Used JSON dumping for debugging torrent structures.
  - Learned to diagnose CURL and socket errors.
- Protocol Optimization:
  - Prioritized announce-list for YTS.mx compatibility.
  - Used compact peer lists for efficiency.
  - Limited peer attempts to balance reliability and performance.
Challenges and Solutions
- Challenge: Parsing bencode robustly.
  - Solution: Implemented recursive parsing with position tracking and error checks.
- Challenge: Handling YTS.mx’s announce-list.
  - Solution: Wrote select_tracker_url to prioritize valid HTTP/HTTPS trackers.
- Challenge: JSON type errors in tracker responses.
  - Solution: Added is_string checks for peers field.
- Challenge: Reliable peer connections.
  - Solution: Used non-blocking sockets with timeouts and retry logic.
- Challenge: Optimizing for single-file torrents.
  - Solution: Removed multi-file logic, focusing on info["length"].
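The non-blocking connection approach listed among the solutions above can be sketched with POSIX calls as follows. The function name connect_with_timeout is hypothetical, and the sketch covers only the connection timeout; the I/O timeouts set with setsockopt are not shown.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <netinet/in.h>
#include <string>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: returns a connected (still non-blocking) socket fd,
// or -1 on an invalid address, a connection error, or a timeout.
int connect_with_timeout(const std::string &ip, std::uint16_t port, int timeout_sec) {
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip.c_str(), &addr.sin_addr) != 1) return -1;

    const int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    // Switch to non-blocking mode so connect() returns immediately.
    const int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }

    if (connect(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) < 0) {
        if (errno != EINPROGRESS) {
            close(fd);
            return -1;
        }
        // Wait until the socket becomes writable or the timeout expires.
        fd_set wfds;
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        timeval tv{};
        tv.tv_sec = timeout_sec;
        if (select(fd + 1, nullptr, &wfds, nullptr, &tv) <= 0) {
            close(fd);
            return -1;
        }
        // Writable does not imply success; SO_ERROR holds the real result.
        int err = 0;
        socklen_t len = sizeof(err);
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0) {
            close(fd);
            return -1;
        }
    }
    return fd;
}
```

A caller would typically try the next peer when this returns -1, which is one way to realize the retry logic mentioned above.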
Future Improvements
- Multi-File Support: Add info["files"] parsing for broader torrent compatibility.
- Parallel Downloads: Use threads to download multiple pieces simultaneously.
- UDP Tracker Support: Extend select_tracker_url to handle udp:// trackers.
- Progress Bar: Enhance download with a visual progress indicator.
- Configuration: Allow user-specified peer IDs and ports via arguments.
Conclusion
This BitTorrent client is a testament to the power of C++ in implementing complex network protocols. By focusing on YTS.mx torrents, it achieves simplicity while demonstrating core BitTorrent functionality. The project deepened my understanding of networking, cryptography, and modern C++ programming, equipping me with skills to tackle distributed systems and protocol design. It’s a showcase of practical engineering, from parsing binary formats to managing peer connections, and serves as a foundation for future enhancements in P2P technology.