Port CMU 15-445 to the bench #97
tareknaser
left a comment
Thanks for the great work. This looks almost ready to merge. I made a few small updates including adding a course entry and a reference solution (based on Claude’s trajectory) and rebasing on top of main. I’ll add a couple more minor updates in separate comments for you to review.
If everything looks good, we can go ahead and merge.
Do you think we can simplify this file to the following?
#!/bin/bash
set -e
echo "=== Setting up CMU 15-445 CountMinSketch Lab ==="
cd /workspace
echo "Installing git"
apt-get update > /dev/null 2>&1
apt-get install -y git > /dev/null 2>&1
echo "Cloning bustub repository"
git clone https://github.com/cmu-db/bustub.git /tmp/bustub > /dev/null 2>&1
git -C /tmp/bustub checkout bd3912741c45370d5f9c7bef638452b10b140138 > /dev/null 2>&1
echo "Moving source to workspace"
mv /tmp/bustub/* ./
mv /tmp/bustub/.clang-format ./ 2>/dev/null || true
mv /tmp/bustub/.clang-tidy ./ 2>/dev/null || true
rm -rf /tmp/bustub .git
echo "Installing build dependencies"
build_support/packages.sh -y > /dev/null 2>&1
echo "Creating checksums for protected files"
mkdir -p /tmp/checksums
sha256sum test/primer/count_min_sketch_test.cpp > /tmp/checksums/test.sha256
echo "Building project"
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug .. > /dev/null 2>&1
make -j$(nproc) > /dev/null 2>&1
echo "Setup complete"
echo "Agent should implement:"
echo " - src/include/primer/count_min_sketch.h"
echo " - src/primer/count_min_sketch.cpp"
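The checksum step in the script above can be tried in isolation. The following is a minimal sketch, assuming GNU `sha256sum` is available; the temporary directory, the file name `protected_test.cpp`, and the helper `verify_protected` are illustrative and not part of the PR.

```shell
#!/bin/bash
# Sketch: record a checksum for a protected file, then detect a later edit.
# All paths and names here are hypothetical stand-ins for the real lab files.
set -u

orig_dir="$PWD"
workdir="$(mktemp -d)"
mkdir -p "$workdir/checksums"
cd "$workdir" || exit 1

printf 'original test contents\n' > protected_test.cpp

# Record the checksum once, as the setup script does.
sha256sum protected_test.cpp > checksums/test.sha256

# verify_protected FILE: succeed only if every file listed in FILE is unchanged.
verify_protected() {
    sha256sum -c "$1" > /dev/null 2>&1
}

if verify_protected checksums/test.sha256; then
    before="unchanged"
else
    before="modified"
fi

# Simulate an edit to the protected file.
printf 'tampered line\n' >> protected_test.cpp

if verify_protected checksums/test.sha256; then
    after="unchanged"
else
    after="modified"
fi

echo "before edit: $before"
echo "after edit:  $after"

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

Because the checksum file stores a relative path, the verification has to run from the same directory in which the checksum was recorded.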
And the evaluation script could be:
#!/bin/bash
set -e
cd /workspace
# Verify test file wasn't modified
echo "Verifying protected files were not modified"
if ! sha256sum -c /tmp/checksums/test.sha256 > /dev/null 2>&1; then
    echo "FAIL: test/primer/count_min_sketch_test.cpp was modified"
    exit 1
fi
echo "Protected files unchanged"
# Build
echo ""
echo "=== Building ==="
rm -rf build
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug .. > /dev/null 2>&1
if ! make -j$(nproc); then
    echo "FAIL: Build failed"
    exit 1
fi
# Run tests
echo ""
echo "=== Running Tests ==="
make -j$(nproc) count_min_sketch_test > /dev/null 2>&1
if ! ./test/count_min_sketch_test; then
    echo "FAIL: Tests failed"
    exit 1
fi
# Format check
echo ""
echo "=== Format Check ==="
make format > /dev/null 2>&1
if ! make check-clang-tidy-p0; then
    echo "FAIL: clang-tidy check failed"
    exit 1
fi
echo ""
echo "PASS: All checks passed"
exit 0
There is no need for a scoring scheme, since we just report pass/fail. What do you think?
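For illustration, the pass/fail contract can be reduced to a small helper that reports the first failing stage through its exit status. This is only a sketch: `run_step` and `evaluate` are hypothetical names, and `true`/`false` stand in for the real build and test commands.

```shell
#!/bin/bash
# run_step NAME CMD...: run CMD; on failure print "FAIL: NAME failed" and return 1.
run_step() {
    local name="$1"
    shift
    if ! "$@"; then
        echo "FAIL: $name failed"
        return 1
    fi
    return 0
}

# evaluate BUILD_CMD TEST_CMD: run the stages in order; the first failure decides.
evaluate() {
    run_step "Build" "$1" || return 1
    run_step "Tests" "$2" || return 1
    echo "PASS: All checks passed"
}

# Demo with stand-in commands: `true` always succeeds, `false` always fails.
result_ok="$(evaluate true true)"
result_bad="$(evaluate true false)"

echo "$result_ok"
echo "$result_bad"
```

With this shape, a caller only needs the exit status and the final line, which matches a report that is strictly pass or fail.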
Thank you, Tarek! I will add more tests to this PR to scale it up.
@Jackcuii For tests, do you mean that we can have more? Can you please add more tests as soon as possible? We need to merge this PR. Thanks a lot.
Hi Xuan! Yes, we can have more! Sorry for the delay; I am heading back home these days. I will push hard after I arrive home on the 4th 😃. I may need to change the test workflow to a 'consecutive test', which means running all 4 remaining tests back to back. That is because Labs 2, 3, and 4 of 15-445 each build on the previous lab, and we do not have a golden version of the project. So we need to make the agent work through the 4 labs consecutively in one go.
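One way the 'consecutive test' idea could look is a loop that checks each lab in order and stops at the first failure, since later labs depend on earlier ones. This is a rough sketch under assumptions: `run_consecutive`, `fake_check`, and the lab names are hypothetical, and the real per-lab check would build and run that lab's tests.

```shell
#!/bin/bash
# run_consecutive CHECK LAB...: call CHECK on each lab in order and
# stop at the first failure, because each lab builds on the previous one.
run_consecutive() {
    local check="$1"
    shift
    local passed=0
    local lab
    for lab in "$@"; do
        if ! "$check" "$lab"; then
            echo "FAIL: $lab (passed $passed before it)"
            return 1
        fi
        passed=$((passed + 1))
    done
    echo "PASS: all $passed labs"
}

# Stand-in check for the demo: pretend that lab3 fails.
fake_check() {
    [ "$1" != "lab3" ]
}

summary="$(run_consecutive fake_check lab1 lab2 lab3 lab4)"
echo "$summary"
```

Stopping early keeps the result pass/fail while still recording how far the agent got.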
This is a Draft PR
Description
This PR adds CMU 15-445 Lab 0 (Count-min Sketch) to the Benchmark Suite. The task requires implementing a thread-safe Count-min sketch data structure, a probabilistic data structure used for frequency estimation in streaming data. This lab focuses on C++ programming, concurrency, algorithms, and database systems concepts.
Changes
data/cmu_15-445/task_cpp/ with complete lab setup
Testing
E2E Tested with Claude Haiku
TODOs