Projects

Large Language Models & AI Safety

Project	Paper Title	Venue	Description	Links
LLM-DNA	LLM DNA: Tracing Model Evolution via Functional Representations	ICLR 2026 (Oral)	Training-free framework for tracing LLM evolution via functional representations	Paper Website
LLM-Deception	Beyond Prompt-Induced Lies: Investigating LLM Deception on Benign Prompts	ICLR 2026 (Oral)	Investigating LLM deceptive behavior on benign prompts using graph connectivity problems	arXiv
DGP	DGP: A Dual-Granularity Prompting Framework for Fraud Detection with Graph-Enhanced LLMs	AAAI 2026	Dual-Granularity Prompting Framework for fraud detection with graph-enhanced LLMs	arXiv
Llamdex	Model-based Large Language Model Customization as Service	EMNLP 2025 Main	Model-based LLM customization service in which clients upload models instead of data	Paper
MegaAgent	MegaAgent: A Large-Scale Autonomous LLM-based Multi-Agent System Without Predefined SOPs	ACL 2025 Findings	Large-scale autonomous LLM-based multi-agent system with dynamic task decomposition	arXiv ACL
CryptoTrade	CryptoTrade: A Reflective LLM-based Agent to Guide Zero-shot Cryptocurrency Trading	EMNLP 2024	Reflective LLM-based agent for cryptocurrency trading with on-chain and off-chain data analysis	Paper

Federated Learning & Privacy

Project	Paper Title	Venue	Description	Links
FeT	Federated Transformer: Multi-Party Vertical Federated Learning on Practical Fuzzily Linked Data	NeurIPS 2024	Multi-party VFL framework for fuzzy identifiers (46% accuracy improvement at 50 parties)	arXiv
LLM-PBE	LLM-PBE: Assessing Data Privacy in Large Language Models	SIGMOD 2024 (Best Paper Nomination)	Toolkit for systematic evaluation of data privacy risks in LLMs	Website
VertiBench	VertiBench: Advancing Feature Distribution Diversity in Vertical Federated Learning Benchmarks	ICLR 2024	Benchmark for vertical federated learning with diverse feature distributions and imbalance	arXiv Website
ModelGo	ModelGo: A Practical Tool for Machine Learning License Analysis	WWW 2024 (Oral)	License analysis tool for machine learning projects with ML-specific licensing framework	-
FedTree	FedTree: A Federated Learning System For Trees	MLSys 2023	Federated learning system for tree-based models with HE, secure aggregation, and DP	Docs
FedGMA	Communication-Efficient Generalized Neuron Matching for Federated Learning	ICPP 2023	Communication-efficient federated learning with generalized neuron matching	-
FedOV	Towards Addressing Label Skews in One-Shot Federated Learning	ICLR 2023	One-shot federated learning framework addressing label skew challenges	-
FedSim	A Coupled Design of Exploiting Record Similarity for Practical Vertical Federated Learning	NeurIPS 2022	Coupled VFL framework leveraging record similarities for improved performance	-
NIID-Bench	Federated Learning on Non-IID Data Silos: An Experimental Study	ICDE 2022	Comprehensive FL benchmark for non-IID data with 4 algorithms and 9 datasets	-

GPU-Accelerated Machine Learning

Project	Paper Title	Venue	Description	Links
DeltaBoost	DeltaBoost: Gradient Boosting Decision Trees with Efficient Machine Unlearning	SIGMOD 2023 (Honorable Mention for Best Artifact Award)	GBDT-based model with efficient machine unlearning capability	-
ThunderSVM	ThunderSVM: A Fast SVM Library on GPUs and CPUs	JMLR 2018	Fast SVM library on GPUs and CPUs with scikit-learn interface	Docs
ThunderGBM	Exploiting GPUs for Efficient Gradient Boosting Decision Tree Training	IEEE TPDS 2019 (Best Paper), JMLR 2020	Fast gradient boosted trees and random forests on GPUs (10x speedup)	Docs

Graph Processing Systems

Project	Paper Title	Venue	Description	Links
RidgeWalker	RidgeWalker: Perfectly Pipelined Graph Random Walks on FPGAs	HPCA 2026	FPGA accelerator for graph random walks with zero-bubble scheduler	-
Clementi	Clementi: Efficient Load Balancing and Communication Overlap for Multi-FPGA Graph Processing	SIGMOD 2025	Multi-FPGA graph processing framework with near-linear scalability (1.86-8.75x speedup)	-
RUSH	RUSH: Real-time Burst Subgraph Detection in Dynamic Graphs	VLDB 2024	Real-time fraud detection framework for dynamic graphs with burst subgraph discovery	Paper
ThunderGP	ThunderGP: Resource-Efficient Graph Processing Framework on FPGAs with HLS	ACM TRETS 2022 (Best Papers in FPGA 2021), FPGA 2021	HLS-based graph processing framework on FPGAs (fastest on HLS-based FPGAs)	-
G3	G3: When Graph Neural Networks Meet Parallel Graph Processing Systems on GPUs	VLDB 2020 Demo	Programmable GNN training system on GPU with graph-centric optimizations	Demo Video
Medusa	Medusa: Simplified Graph Processing on GPUs	IEEE TPDS 2013	GPU-based parallel sparse graph processing with sequential C/C++ code	-
RICH	RICH: Real-time Identification of negative Cycles for High-efficiency arbitrage	-	Real-time negative cycle detection for arbitrage opportunities in token graphs	-

Stream Processing

Project	Paper Title	Venue	Description	Links
OEBench	OEBench: Investigating Open Environment Challenges in Real-World Relational Data Streams	VLDB 2024	Benchmark for open environment challenges in relational data streams (55 datasets)	-
BriskStream	BriskStream: Scaling Stream Processing on Multicore Architectures	SIGMOD 2019	Multicore, NUMA-optimized data stream processing system	arXiv
PyOE	PyOE: Python Library for Data Stream Learning	-	Machine learning library for data stream learning, supporting 6 tasks	Website

Hardware Acceleration & Optimization

Project	Paper Title	Venue	Description	Links
HiPACK	HiPACK: Efficient Sub-8-Bit Direct Convolution with SIMD and Bitwise Management	MICRO 2025	Sub-8-bit direct convolution acceleration for ARM processors (3.2x+ speedup)	-
