Welcome Dwarkesh fans,
We’re Dwarkesh fans too, and we’re pleased to be an ongoing sponsor of his content. Here we’ll explain how each episode we sponsor relates to our work, and share our efforts in ML and related topics over time.
How our work relates to topics in these episodes:
Nov 13
Gwern Branwen - How an Anonymous Researcher Predicted AI's Trajectory
In his conversation with Dwarkesh, Gwern said "There are some places you go and people just work well together. There's nothing specific about it, but for whatever reason they all just click in just the right way." Couldn’t have said it better! There's a unique culture at Jane Street where everything just seems to fall into place when we work together. We hire great people, the work is interesting, and our focus on people creates a culture of collaboration that makes work here rewarding and fun.
Oct 2
Dylan Patel & Jon (Asianometry) – How the Semiconductor Industry Actually Works
This episode was particularly relevant to our work. Choosing hardware and using it efficiently is an important part of any machine learning effort. Compared to typical LLM training runs, we have larger datasets with a different structure, and in some cases much stricter latency requirements on the inference side. As a result, we end up with a mix of CPUs, GPUs, and FPGAs, and are exploring yet more esoteric technologies.
To support these efforts, we're hiring experts in CUDA, FPGA programming, and datacenter engineering. People here often work on multiple parts of the stack and collaborate across these boundaries in a fairly organic way, but we do have separate job postings depending on whether you're most interested in FPGAs, CUDA and performance optimization, or mostly the Python layer. Don't stress too much about where to apply; we'll get you on the right track once we get to know you.
Jane Street Kaggle: Real-Time Market Data Forecasting
Predict financial market responders using real-world data.
Want a glimpse into the daily challenges of successful trading? We’ve just launched a new Kaggle competition so you can try your hand.
We hired the winner of our last Kaggle competition, and he organized this one. This challenge highlights the difficulties in modeling financial markets, including fat-tailed distributions, non-stationary time series, and sudden shifts in market behavior. Compared with our last competition, it has more data, more sophisticated features, and various auxiliary responders, and it provides a lagged responder so participants can experiment with approaches such as online learning. There are many ways to play with the data in this competition, and we want people to have fun!
1st Place: $50,000
2nd Place: $25,000
3rd Place: $10,000
4th - 10th Place: $5,000
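For readers curious what learning online from a lagged responder might look like, here is a minimal sketch. It is not the competition's API or data: the class `OnlineLinearModel`, the helpers `make_synthetic`, `run_online`, and `demo`, and the synthetic two-feature series with a regime shift halfway through are all assumptions made up for illustration. The model predicts at each step and only updates once the previous step's responder has been revealed.

```python
import random


class OnlineLinearModel:
    """Hypothetical linear model updated one observation at a time."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.learning_rate = learning_rate

    def predict(self, x):
        return sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x, y):
        # Regularized normalized LMS step: move the weights toward the
        # observed responder, scaled down for large feature vectors.
        error = y - self.predict(x)
        scale = self.learning_rate * error / (1.0 + sum(xi * xi for xi in x))
        self.weights = [w + scale * xi for w, xi in zip(self.weights, x)]


def make_synthetic(n_steps, seed=0):
    """Synthetic two-feature series whose true weights flip halfway through."""
    rng = random.Random(seed)
    features, responders = [], []
    for t in range(n_steps):
        x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        w = [1.0, -0.5] if t < n_steps // 2 else [-1.0, 0.5]
        features.append(x)
        responders.append(w[0] * x[0] + w[1] * x[1] + rng.gauss(0.0, 0.1))
    return features, responders


def run_online(features, responders, model):
    """Predict at each step, learning only from responders already revealed."""
    predictions = []
    for t, x in enumerate(features):
        if t > 0:
            # At step t the responder for step t - 1 becomes available.
            model.update(features[t - 1], responders[t - 1])
        predictions.append(model.predict(x))
    return predictions


def mean_squared_error(predictions, targets):
    return sum((p - y) ** 2 for p, y in zip(predictions, targets)) / len(targets)


def demo(n_steps=2000):
    features, responders = make_synthetic(n_steps)
    online = run_online(features, responders, OnlineLinearModel(n_features=2))
    # Static baseline: weights frozen at their first-regime values.
    static = [1.0 * x[0] - 0.5 * x[1] for x in features]
    tail = slice(3 * n_steps // 4, n_steps)  # last quarter, after the shift
    return (
        mean_squared_error(online[tail], responders[tail]),
        mean_squared_error(static[tail], responders[tail]),
    )


if __name__ == "__main__":
    online_mse, static_mse = demo()
    print(f"online MSE: {online_mse:.4f}, static MSE: {static_mse:.4f}")
```

On this toy series, the online model should show a much lower error than the frozen baseline after the shift, since it keeps adapting as each lagged responder arrives. Real market data is far noisier and fatter-tailed than this Gaussian example, so treat it as a starting point for experimentation only.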
Deeper Learnings
So, what could you do here?
ML Engineers help drive the direction of an ML platform that is used daily by traders and researchers. The work is wide-ranging, including things like developing libraries for automating ML workflows and experiment evaluation, digging into the internals of open‑source ML tools, and optimizing our systems to match the needs of our trading systems.
ML Performance Engineers optimize the performance of our models. This work focuses on efficient large-scale training, low-latency inference in real-time systems, and high-throughput inference in research. Engineers take a whole-systems approach that spans storage systems, networking, and host- and GPU-level considerations.
ML Researchers are responsible for building models to price securities and execute trades in live trading systems. The role is a mix of trading and software engineering: the work involves analyzing large datasets, building and testing models, creating new trading strategies, and writing the code that implements them.
ML Interns are paired with full-time mentors, collaborating on real-world projects and learning how Jane Street applies advanced machine learning and statistical techniques to model and predict moves in financial markets. Through a series of classes and activities, they analyze real trading data using our growing GPU cluster, which contains thousands of A100s and H100s. Over the course of the program, interns gain an understanding of the differences between textbook machine learning and its application to noisy financial data.