How the Admissions Predictor Works

Understand the gradient-boosted model behind LSD.Law's admissions chances predictor — what data it's trained on, the 54 features it uses, and the three-way probability it returns.

The admissions predictor is a LightGBM gradient-boosted decision tree, trained in Python and served via ONNX inference inside LSD.Law. For any applicant-school pair, it returns the probability of acceptance, waitlist, and rejection — three numbers that sum to 100%. A second heuristic stage then adjusts for historical waitlist-to-accept conversion.
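LightGBM's multiclass objective produces one raw score per class, and a softmax turns those scores into probabilities that sum to 1. A minimal sketch of that final step — the raw margins below are made-up numbers, not real model output:

```python
import math

def softmax(raw_scores):
    """Convert per-class raw margins into probabilities that sum to 1."""
    shift = max(raw_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in raw_scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw margins for (accept, waitlist, reject)
# at one applicant-school pair.
p_accept, p_waitlist, p_reject = softmax([0.8, -0.2, 1.1])
```

Whatever the individual margins, the three outputs always sum to 100%, which is why the predictor's bar never has a gap.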

Training data

The model is trained on self-reported applicant cycles from LSD.Law, starting with the 2019 matriculating cycle and running through the most recent complete one. Each row is one applicant's outcome at one school. Cycles are weighted by exponential decay so recent data dominates and the model tracks current admissions patterns rather than historical ones. Rows with corrupt or late-cycle waitlist timing are filtered out, and stale-pending applications from completed cycles are relabeled as rejections at reduced weight to offset reporting bias.
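The exponential-decay weighting can be sketched as follows — the decay rate here is illustrative, not the value the model actually uses:

```python
def cycle_weight(cycle_year, latest_cycle, decay=0.5):
    """Older cycles get exponentially smaller training weight,
    so recent admissions patterns dominate the fit."""
    return decay ** (latest_cycle - cycle_year)

# Per-cycle weights relative to a hypothetical 2024 latest cycle.
weights = {year: cycle_weight(year, 2024) for year in range(2019, 2025)}
```

With a decay of 0.5, each cycle counts half as much as the one after it; the real rate is a tuning choice.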

Training uses forward-chaining temporal cross-validation — it fits on earlier cycles and tests on the next one, never on the future — so accuracy estimates reflect how the model would have performed at deployment. A final model is then fit on all cycles and calibrated via Platt scaling.
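Forward-chaining splits look like this in outline (cycle labels are illustrative):

```python
def forward_chaining_splits(cycles):
    """Yield (train_cycles, test_cycle) pairs: each test cycle is
    predicted using only the cycles that came before it."""
    cycles = sorted(cycles)
    for i in range(1, len(cycles)):
        yield cycles[:i], cycles[i]

splits = list(forward_chaining_splits([2019, 2020, 2021, 2022]))
# First split trains on 2019 and tests on 2020; the last trains
# on 2019-2021 and tests on 2022. No split ever sees the future.
```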

What the model considers

The predictor uses 54 features grouped into several categories:

  • Applicant: LSAT, GPA, URM status, international status, non-traditional status, military background, years out of undergrad, softs tier, character & fitness
  • Application: application type (regular, early decision, priority), ED-to-RD conversion, in-state status
  • School: school identity, 25/50/75 LSAT and GPA percentiles from the most recent ABA report, acceptance rate, class size
  • Distance: how far your LSAT and GPA sit from the school's 25th, 50th, and 75th percentiles
  • Trend: year-over-year changes in the school's median LSAT, median GPA, and acceptance rate
  • Timing and survival: current day in the cycle, when you submitted and when your application went complete, how long you've been waiting, and the fraction of decisions already released at that school (overall and split by outcome)
  • Waves: whether today falls inside a detected decision wave, days since the last wave, days until the next, and rejections released in past waves
  • Signal: whether you've had an interview, how long you've been under review, whether you've had a UR2 status change
  • Expanding-window cycle stats: the running median LSAT and GPA of applicants already accepted at that school this cycle, and how far your own stats sit from those medians — the model's only live signal about the current cycle's class profile before the ABA report exists
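The distance features above reduce to signed gaps between your stats and the school's reported percentiles. A hedged sketch — the field names are made-up, not the model's actual feature names:

```python
def distance_features(lsat, gpa, school):
    """Signed gaps between applicant stats and the school's
    25th/50th/75th percentile LSAT and GPA."""
    feats = {}
    for pct in ("p25", "p50", "p75"):
        feats[f"lsat_minus_{pct}"] = lsat - school[f"lsat_{pct}"]
        feats[f"gpa_minus_{pct}"] = round(gpa - school[f"gpa_{pct}"], 2)
    return feats

# Hypothetical school percentiles from an ABA report.
school = {"lsat_p25": 166, "lsat_p50": 170, "lsat_p75": 172,
          "gpa_p25": 3.61, "gpa_p50": 3.84, "gpa_p75": 3.94}
feats = distance_features(171, 3.80, school)
```

Signed gaps let the trees learn asymmetries — sitting one point above a median is not the mirror image of sitting one point below it.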

Reading the output

The predictor returns three calibrated probabilities — P(Accept), P(Waitlist), P(Reject) — that sum to 100%. These are point estimates, not credible intervals. On the predictor page they appear as a colored bar, with a timeline chart underneath.

Two views are shown. The unconditional probability is the headline number — your overall chance of each outcome, marginalized over when in the cycle you might hear back. The conditional probability asks the timing-aware question: given that you're still pending on day X, what are your updated odds? The predictor computes this by sweeping through every day of the cycle with the model's timing, survival, and wave features set to that day. The conditional view is most useful mid-cycle, when silence from a school is itself informative.
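The conditional view is essentially Bayes' rule over the model's resolution curves: the outcome mass that resolves after day X, renormalized by the mass still pending at day X. A sketch with made-up per-day curves:

```python
def conditional_probs(day, daily_accept, daily_waitlist, daily_reject):
    """P(outcome | still pending on `day`): sum the mass that resolves
    on or after `day`, then renormalize by the total mass still pending."""
    accept = sum(daily_accept[day:])
    waitlist = sum(daily_waitlist[day:])
    reject = sum(daily_reject[day:])
    pending = accept + waitlist + reject
    return accept / pending, waitlist / pending, reject / pending

# Toy 4-day cycle: per-day probability of each outcome landing that day.
acc = [0.10, 0.10, 0.05, 0.05]   # 30% accept overall
wl  = [0.00, 0.05, 0.05, 0.10]   # 20% waitlist overall
rej = [0.05, 0.10, 0.15, 0.20]   # 50% reject overall
a, w, r = conditional_probs(2, acc, wl, rej)
```

In this toy cycle the unconditional accept chance is 30%, but conditional on still pending at day 2 it falls to about 17% — silence from the school has itself shifted the odds.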

A second chart shows a cumulative decomposition: by day X, what fraction of your probability mass has resolved into accept / waitlist / reject, and how much is still pending. Late-cycle waitlist probability is clamped once the model's curve falls off its peak, so residual WL mass folds into rejection where appropriate.
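The clamping step can be illustrated as: once the waitlist curve is past its peak day, residual waitlist mass folds into rejection. This is one plausible rule written as a sketch, not the exact heuristic the predictor uses:

```python
def clamp_waitlist(day, wl_peak_day, p_accept, p_waitlist, p_reject):
    """Past the waitlist curve's peak day, fold remaining
    waitlist mass into rejection."""
    if day > wl_peak_day:
        return p_accept, 0.0, p_reject + p_waitlist
    return p_accept, p_waitlist, p_reject

# Hypothetical late-cycle state: day 200, waitlist curve peaked at day 150.
a, w, r = clamp_waitlist(day=200, wl_peak_day=150,
                         p_accept=0.10, p_waitlist=0.15, p_reject=0.75)
```

The three numbers still sum to 100% after clamping; probability is reassigned, never dropped.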

Known limitations

  • Schools with few data points are noisier. The model is trained on self-reported LSD.Law outcomes, so schools with thin historical coverage produce less reliable point estimates. The predictor does not surface per-prediction error bars — treat tight outputs at sparsely covered schools with appropriate skepticism.
  • LSAT and GPA are required. If you don't supply them, the predictor will not run at all. Optional applicant fields (softs, years out, non-trad status) left blank are handled as missing values by the tree model, not silently imputed.
  • The model does not read your application. It cannot see your personal statement, letters of recommendation, or other qualitative materials. It predicts based on quantifiable factors only.
  • Early-cycle predictions have less signal. Timing, survival, and wave features carry their full weight later in the cycle, once a school has released enough decisions for the model to locate you against its historical release pace.
  • New schools and new cycles have gaps. Prediction requires the target school to have ABA data on file. For the in-flight cycle, the expanding-window features are the model's only live view of the class being built — before any decisions are released, it falls back on last cycle's profile.