BERTBERT | Nickolas Chua

Overview

Developed for TikTok TechJam 2025, BERTBERT is a multi-tower deep learning architecture that classifies restaurant reviews. It was trained on over 100K reviews and is designed to cope with severe class imbalance in the training data.

Technical Highlights

Multi-Tower Architecture: Designed a cross-attention architecture that processes text reviews alongside 20 metadata inputs (rating, time, location, user history, etc.) for more accurate classification.
Custom Labelling Solution: Built a labelling pipeline that outperformed GPT-4o on domain-specific classification tasks, achieving higher accuracy on edge cases and ambiguous reviews.
Class Imbalance Handling: Implemented specialized sampling and loss weighting strategies to handle the severe class imbalance inherent in review datasets where most reviews are positive.
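
To make the cross-attention idea concrete, here is a minimal sketch of the underlying arithmetic in plain Python. It is not the project's actual model code: it assumes text-token embeddings act as queries and metadata embeddings act as keys and values (the direction used in BERTBERT is not stated here), it omits the learned projection matrices a real layer would have, and the toy vectors and names (text_tokens, metadata, cross_attention) are hypothetical.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all key/value rows."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        # Output is the attention-weighted sum of the value vectors.
        mixed = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical toy inputs: 2 text-token embeddings and 3 metadata embeddings, dim 4.
text_tokens = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
metadata = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]

fused, attn = cross_attention(text_tokens, metadata, metadata)
```

Each row of attn sums to 1 and says how strongly one text token draws on each metadata feature. In a real model this step would normally come from a deep learning framework's attention layer rather than hand-written loops.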

Scale

Processed and trained on 100K+ restaurant reviews with real-world noise and inconsistencies.

Technical Challenge

The Problem: Severe class imbalance. Spam, advertisement, and rant together accounted for fewer than 50 of our 1,500 manually labeled samples. The advertisement class ended up with an F1 score of 0% because we simply didn't have enough examples to learn from.
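
To illustrate how a class ends up with an F1 score of 0, the sketch below computes per-class F1 on a tiny made-up example. The labels and predictions are hypothetical (including the "valid" class name, which is an assumption), not the project's data: when a classifier never correctly predicts a rare class, that class has no true positives, so its F1 is 0 no matter how well the majority class does.

```python
def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are 0 (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy data: the single advertisement is misclassified as "valid".
y_true = ["valid"] * 8 + ["advertisement", "spam"]
y_pred = ["valid"] * 8 + ["valid", "spam"]

f1_by_class = {
    label: per_class_f1(y_true, y_pred, label)
    for label in ("valid", "advertisement", "spam")
}
```

Here the "valid" class still scores well even though the advertisement class scores 0, which is why overall accuracy can hide this kind of failure.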

The Solution: We rejected SMOTE (synthetic oversampling would just amplify noise) and instead supplemented our Google Reviews data with Yelp data to naturally increase minority class representation. We combined this with logarithmic class weighting, and used PCA compression (768→128 dimensions, preserving 85.43% of the variance) to work within GPU constraints.
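
As a rough sketch of what logarithmic class weighting can look like, the snippet below uses the formula weight = ln(1 + N / n_c), where N is the total sample count and n_c is the count for class c. Both the formula and the class counts are assumptions for illustration; the exact weighting used in the project is not specified here.

```python
import math

def log_class_weights(counts):
    """Log-damped inverse-frequency weights: ln(1 + total / class_count)."""
    total = sum(counts.values())
    return {label: math.log(1.0 + total / n) for label, n in counts.items()}

# Hypothetical counts, chosen to resemble a heavily imbalanced 1,500-sample set.
counts = {"valid": 1455, "spam": 20, "rant": 15, "advertisement": 10}

weights = log_class_weights(counts)

# Plain inverse-frequency weights, for comparison.
total = sum(counts.values())
inverse = {label: total / n for label, n in counts.items()}

log_ratio = weights["advertisement"] / weights["valid"]
inverse_ratio = inverse["advertisement"] / inverse["valid"]
```

The logarithm keeps rare classes weighted more heavily than common ones, but compresses the gap: log_ratio is far smaller than inverse_ratio, so a handful of noisy minority examples cannot dominate the loss.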

Why It Matters: We learned that "most of the 80% problems are data-related." Model architecture matters less than having quality labeled data for every class you're trying to predict.