How Carta uses machine learning to create market-driven compensation benchmarks

Authors: Josh Steinfeld, Josh Zastrow
Read time: 5 minutes
Published date: 22 January 2025
The secret to hiring and retaining the best talent? Compensation benchmarks that accurately reflect what's going on in the labor market. Find out how Carta uses machine learning to help companies develop a competitive compensation plan.

Figuring out employee compensation can be tricky. Often, making an offer is a shot in the dark based on the disparate information you get online or through your network. You could consult with a law firm and pay expensive hourly rates, or use survey data from compensation consulting firms that’s updated once or twice a year. That cadence is too slow for a quickly changing job market, and the data also lacks information on startup equity compensation, so it doesn’t show the whole picture.

These limitations make it especially difficult to create competitive compensation packages for specialized roles. Let’s say you need to hire an AI & machine learning engineer, or a PhD research scientist. Understanding what people with the right skills and experience are worth can be challenging because there’s not nearly as much data for those roles as there is for software engineers, for instance.

Raw compensation data can also be volatile and contain outliers, such as unrealistic salary increases between consecutive levels, or entry level roles getting paid more than senior roles. Smoothing out the data can take a lot of time and manual effort—a job that often falls to full-time compensation analysts or HR professionals.

Why compensation bands matter

A compensation band is an upper and lower range of compensation that you would be willing to pay someone in a specific role. Bands are important because they help you make fair and competitive compensation offers. The idea is that employees in the same job category and level (e.g. senior software engineers) will be paid within the same band.

At the same time, the range within each band allows you to factor in education, experience, and performance. You might start an employee at the lower end of the band and give them a path to the higher end through promotions, incentivizing them to stay at your company and develop their skills. 

For that reason, compensation bands go hand in hand with levels. If an employee is performing well, there should be a clear path for them to take on more responsibility and move up to the next level (and a higher band).

However, bands only work if they’re competitive—meaning they need to reflect what’s going on in the labor market. That’s why it’s so important to leverage fresh data when benchmarking compensation for internal roles against current market conditions.

Using data to develop fair, competitive benchmarks

Carta Total Comp leverages real-time compensation data from over one million employees to produce meaningful, market-driven benchmarks. By precisely mapping employees to their correct roles and seniorities, then applying advanced statistical modeling to reflect current market trends, the platform can estimate salary and equity compensation for a variety of roles—across different levels, geographies, industries, and company sizes and valuations. Even for specialized roles with limited market data, we can produce sensible pay bands by analyzing real benchmarks from employees and companies linked to the Carta network.

Our machine learning team trains these benchmarking models on a huge dataset that’s updated every quarter. This allows the system to adapt as new data comes in, ensuring its predictions keep up with market trends.

How the machine learning works

Carta’s machine learning algorithms perform two key functions: role matching, which uses large language models (LLMs), and compensation benchmarking, which uses specialized statistical models.

Role matching

We start by matching unstructured employee information—such as job titles, compensation, tenure, and company size—to a clear job taxonomy. The goal is to identify a department, role, and seniority level for each employee so we can assign their compensation to the correct benchmarks.

This data is drawn from over a million employee records across multiple sources, including HR information systems (HRIS) and our core cap table product. After being filtered and cleaned, the data is fed into large language models (LLMs) that use the relevant context to classify the department, role, and level for each employee.

Despite such broad coverage, our system maintains a high level of accuracy and scales smoothly to support our growing customer base and taxonomy. We use feedback from user-corrected entries and previously reviewed records to continually benchmark the algorithm’s performance (i.e. recall and precision).
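To make the precision and recall check concrete, here is a minimal sketch that scores predicted departments against reviewed records. It assumes the reviewed value is treated as ground truth; the function name and sample labels are hypothetical, and this is not Carta's actual evaluation code.

```python
def precision_recall(predicted, reviewed, label):
    """Precision and recall for one label, treating reviewed values as truth."""
    pairs = list(zip(predicted, reviewed))
    true_pos = sum(1 for p, r in pairs if p == label and r == label)
    false_pos = sum(1 for p, r in pairs if p == label and r != label)
    false_neg = sum(1 for p, r in pairs if p != label and r == label)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Made-up sample: model output vs. user-corrected entries.
predicted = ["Engineering", "Engineering", "Research", "Sales", "Research"]
reviewed = ["Engineering", "Research", "Research", "Sales", "Research"]

engineering_scores = precision_recall(predicted, reviewed, "Engineering")
research_scores = precision_recall(predicted, reviewed, "Research")
```

In this sample, one of the two "Engineering" predictions was corrected to "Research", so precision for Engineering drops while its recall stays perfect.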

Compensation benchmarking

Enriching employee data with discrete job categories allows us to benchmark across more than 120,000 slices of compensation data. These slices are pre-defined subsets, such as:

  • Compensation type (e.g. first grant fully diluted percent)

  • Company size (e.g. 100-500 employees)

  • Office location (e.g. California and New York)

  • Industry (e.g. healthcare and life sciences)

  • Department (e.g. Research)

  • Staff seniority level (e.g. manager)

  • Specialization (e.g. PhD scientist)

The first step is to group the underlying data by company valuation, geography, industry, and our role-matched taxonomy (i.e. department, role, and seniority). We then calculate the 25th to 90th percentiles for each group to reflect actual market distributions in salary data.
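The grouping and percentile step can be sketched roughly as follows. The field names, the choice of percentiles within the 25th to 90th range, and the linear-interpolation method are all assumptions made for illustration; the sample salaries are made up.

```python
from collections import defaultdict

# Hypothetical field names; the actual schema is not described here.
GROUP_KEYS = ("valuation_tier", "geography", "industry", "department", "role", "level")
PERCENTILES = (25, 50, 75, 90)


def percentile(values, pct):
    """Percentile (0-100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def baseline_percentiles(records):
    """Group salary records by slice and compute baseline percentiles."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[k] for k in GROUP_KEYS)
        groups[key].append(record["salary"])
    return {
        key: {p: percentile(salaries, p) for p in PERCENTILES}
        for key, salaries in groups.items()
    }


# Made-up sample data for a single slice.
slice_fields = {
    "valuation_tier": "$100M-$500M",
    "geography": "California",
    "industry": "Healthcare",
    "department": "Engineering",
    "role": "Software Engineer",
    "level": "Senior",
}
salaries = (100_000, 120_000, 140_000, 160_000, 180_000)
records = [dict(slice_fields, salary=s) for s in salaries]
baselines = baseline_percentiles(records)
```

Each key in `baselines` identifies one slice, and each value maps a percentile to its salary estimate for that slice.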

These baseline statistics are then fed into tailored machine learning models that are trained to reflect the underlying data, real-world compensation expectations, and common-sense principles. This approach ensures the resulting pay ranges and progressions are not only highly accurate but also intuitive and aligned with industry norms for different levels, percentiles, and company sizes.

For example, penalizing the model when it estimates L4 pay to be higher than L5 teaches it that compensation should increase with seniority. Equally, a principle like “early-stage companies typically grant a higher ownership percentage than late-stage companies” helps the model understand that one percent of equity in a $1 billion company is worth a lot more than the same stake in a $1 million company.
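One way to express the seniority principle is a penalty term added to the model's fit error. The sketch below uses a squared hinge penalty, which is an assumption on our part: the exact loss function is not specified here, and the function names, weight, and sample figures are hypothetical.

```python
def monotonicity_penalty(pay_by_level):
    """Sum of squared amounts by which a level's estimate exceeds the next level's.

    pay_by_level is ordered from most junior to most senior.
    """
    return sum(
        max(0.0, junior - senior) ** 2
        for junior, senior in zip(pay_by_level, pay_by_level[1:])
    )


def penalized_loss(estimates, observed, weight=1.0):
    """Mean squared error plus a weighted penalty for seniority inversions."""
    fit_error = sum((e - o) ** 2 for e, o in zip(estimates, observed)) / len(estimates)
    return fit_error + weight * monotonicity_penalty(estimates)


# Made-up estimates in $ thousands for three consecutive levels.
ordered_estimates = [150, 180, 210]
inverted_estimates = [150, 210, 200]  # middle level estimated above the senior level

ordered_penalty = monotonicity_penalty(ordered_estimates)
inverted_penalty = monotonicity_penalty(inverted_estimates)
```

Estimates that rise with seniority incur no penalty, while an inversion adds a cost that grows with the size of the gap, nudging the model back toward ordered pay progressions.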

Observability at Carta

Comprehensive monitoring helps us maintain stable trends and reliable estimates across retraining cycles. Our team also regularly reviews the underlying market data and model behavior across all benchmarks. With the help of visual dashboards, we monitor:

  • Sample size—The number of compensation data points available for each specific segment, such as a role, level, or industry. Larger sample sizes within a slice ensure more reliable and stable compensation estimates.

  • Data recency—How up-to-date the market data is. More recent data provides a better reflection of current compensation trends.

  • Accuracy—How close the model’s benchmark estimates are to underlying compensation values.

  • Drift—Changes in the underlying market data over time. By comparing the characteristics of new and previous datasets, this metric helps to identify whether shifts in compensation estimates are driven by actual market trends or reflect inconsistencies in the data collection process.
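As a rough illustration of the sample size and drift checks, the sketch below compares the previous and current data for one slice. Using the relative shift in the median as a drift proxy is an assumption, as are the function name and both thresholds; the monitoring dashboards may rely on different statistics.

```python
from statistics import median


def slice_health(previous, current, min_samples=30, max_shift=0.10):
    """Summarize sample size and median shift for one benchmark slice.

    min_samples and max_shift are hypothetical thresholds.
    """
    previous_median = median(previous)
    shift = (median(current) - previous_median) / previous_median
    return {
        "sample_size": len(current),
        "low_sample": len(current) < min_samples,
        "median_shift": shift,
        "drift_flag": abs(shift) > max_shift,
    }


# Made-up salaries in $ thousands from two consecutive retraining cycles.
previous_cycle = [100, 110, 120]
current_cycle = [115, 125, 135]
report = slice_health(previous_cycle, current_cycle)
```

A flagged slice is a prompt for review rather than a verdict: a reviewer still has to decide whether the shift reflects the market or a change in how the data was collected.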

Building a compensation plan for your company

With Carta Total Comp, you can develop a competitive compensation plan that’s aligned with real-world benchmarks. In addition to helping you decide what to pay individual employees, the platform automatically identifies roles that are paid significantly above or below the market rate. This approach reduces the time and error involved in manually maintaining pay bands, making it easier to hire and retain top talent as the market moves and your business grows.
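Flagging off-market pay can be sketched as a comparison against a benchmark with a tolerance. The 15% threshold, the function name, and the sample data below are hypothetical; how Total Comp defines "significantly" is not specified here.

```python
def flag_off_market(employees, benchmarks, tolerance=0.15):
    """Return (name, deviation) for employees paid outside the tolerance.

    deviation is the relative gap between salary and the role's benchmark.
    """
    flagged = []
    for employee in employees:
        benchmark = benchmarks[employee["role"]]
        deviation = (employee["salary"] - benchmark) / benchmark
        if abs(deviation) > tolerance:
            flagged.append((employee["name"], round(deviation, 3)))
    return flagged


# Made-up benchmark and employees.
benchmarks = {"Senior Software Engineer": 150_000}
employees = [
    {"name": "A", "role": "Senior Software Engineer", "salary": 120_000},
    {"name": "B", "role": "Senior Software Engineer", "salary": 155_000},
    {"name": "C", "role": "Senior Software Engineer", "salary": 180_000},
]
flagged = flag_off_market(employees, benchmarks)
```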

Carta’s machine learning system does the groundwork, but you’re in control of creating a program tailored to your company’s specific needs. While Total Comp creates 50th percentile compensation bands for each employee by default, the target position can be changed at any time in your account settings. You can also customize your settings by role and type of compensation—for instance, if you wanted to pay in the 75th percentile for salary and the 25th percentile for equity. Any adjustments you make will be automatically reflected in the relevant employees’ bands, helping you stay consistent even at scale.
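Conceptually, a per-type target percentile works like a lookup with a default. This is a minimal sketch with hypothetical names and made-up benchmark figures, not the product's actual settings model.

```python
DEFAULT_TARGET_PERCENTILE = 50  # the default target described above


def target_benchmarks(percentiles_by_type, targets):
    """Pick the benchmark at the chosen percentile for each compensation type.

    Types without an explicit target fall back to the default percentile.
    """
    return {
        comp_type: table[targets.get(comp_type, DEFAULT_TARGET_PERCENTILE)]
        for comp_type, table in percentiles_by_type.items()
    }


# Made-up benchmarks for one role: salary in dollars, equity as percent ownership.
role_percentiles = {
    "salary": {25: 120_000, 50: 140_000, 75: 160_000},
    "equity": {25: 0.05, 50: 0.10, 75: 0.20},
}

default_plan = target_benchmarks(role_percentiles, {})
custom_plan = target_benchmarks(role_percentiles, {"salary": 75, "equity": 25})
```

Here `custom_plan` mirrors the example above: the 75th percentile for salary and the 25th percentile for equity.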

Josh Steinfeld
Josh Steinfeld leads product strategy for Carta Total Compensation. Josh has been a compensation professional for the last 20 years, most recently leading compensation at Google for YouTube and Google’s corporate functions.
Josh Zastrow
Josh Zastrow is a Staff Machine Learning Engineer at Carta. With over 10 years of experience in applied machine learning, Josh has built out services that transform complex workflows into efficient systems across multiple industries—including property, retail, and finance.

DISCLOSURE: This communication is on behalf of eShares, Inc. dba Carta, Inc. ("Carta"). This communication is for informational purposes only, and contains general information only. Carta is not, by means of this communication, rendering accounting, business, financial, investment, legal, tax, or other professional advice or services. This publication is not a substitute for such professional advice or services nor should it be used as a basis for any decision or action that may affect your business or interests. Before making any decision or taking any action that may affect your business or interests, you should consult a qualified professional advisor. This communication is not intended as a recommendation, offer or solicitation for the purchase or sale of any security. Carta does not assume any liability for reliance on the information provided herein. © 2024 Carta. All rights reserved. Reproduction prohibited.