Good afternoon.
We're going to be discussing how smarter campaigns start and end with data science. And, I guess, flip to the next page.
To introduce ourselves, I'm Jane Johnson. I'm a client strategist in our data-driven marketing division. And I'll apologize now about my voice, but I'll try to make it through.
And hi. My name is Sushaanth. I work on the data science team. I'm a data scientist. I've been with Deluxe for six years. So, we're going to skip the longer introductions and move in; we have a lot of slides to cover today.
Okay. So, you have heard about AI this morning, and a lot about using analytics and all that jargon so far. Right? I wanted to start off this presentation by demystifying that jargon, because there has been so much hype and there is too much misinformation about it.
So, I wanted to straighten it out from a marketing standpoint, because most of you here are marketers, right? So, what is data science? The simplest answer is: whatever you do using data to drive insights for a given business case and get your ROI is data science.
It could be as simple as, for example, using a FICO score and targeting people at maybe seven hundred and above, and you find your responses are going up; that's also data science. And it could go all the way to a complex strategy where, for a HELOC, you want to find different pockets of people: people who need money for debt consolidation, people who need money to do home improvements, people who need money for some kind of emergency. Right?
Now, you need to find all of those buckets, which means you have to have different logic to target each of those groups, and then marry all of them together to target the whole group. That is also data science. So, those are two very different pieces, but in the end, the final goal of data science is to get you insights that solve your business case using the data at hand. Now, having said that, we go to the next jargon, which is models.
Right? So, in this slide, we're going to talk about what a model is and why we need a model in the first place. We don't need models for everything.
Right? And then, in the end, how we are going to build a model. Now, let's start off with the simple thing: what is a model? A model is a black box that takes data in and tries to solve a specific business problem.
Right? It gives you direction on how you could solve a business problem. Now, you will have seen a lot of models. These are examples of models.
Right? Everyone uses FICO. FICO is a model.
Now, for every model there needs to be a business case. Why do you need a model? Why do we need FICO? Because it's hard to find people that are non-risky. Right?
Because there is a lot of information stored, and you have to somehow figure out who is non-risky. So, FICO is embedded in all our lives at this point. Right? So, we have the why.
Why do we need the model? Right?
Now, what does the model do? So, what does FICO do? It takes in all the information about you and says whether you are risky or not. How does it say that?
It's going to give you a directional number. It's a number. In other cases, it will give you a probability, like heads or tails, right? And you're going to define your business case by looking at that number.
In the case of FICO, the higher the FICO, the lower your risk.
The lower the FICO, the higher the risk. Right? Now, that is a model. And a model could be very simple: let's say you create a math equation; a math equation is also a model if it gives you a final output. With respect to FICO, though, FICO is not a simple model.
FICO is modeling multiple pieces of risk information and giving you a final score. So, it could be a group of different models together. I'll give you another example.
Let's say, for the same HELOC example: you could have specific models for each target, a model to target debt consolidation, a model that targets people who want to do home improvements, and maybe another model that tries to find people needing money for an emergency. Now, you marry all those results, which would also be another model, and in the end, you would have one score. And that one score is going to enable you to take actionable insights and go on to target those people.
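To make the "marrying sub-models" idea concrete, here is a minimal sketch. This is not the actual Deluxe methodology; the purpose-specific scores and the max() combination rule are invented for illustration only:

```python
# Sketch: marrying several purpose-specific sub-model scores into one
# target score. The combination rule (take the best opportunity across
# purposes) is an illustrative assumption, not the real methodology.

def combined_heloc_score(debt_consolidation, home_improvement, emergency):
    """Each input is a probability from one purpose-specific model;
    the combined score is the strongest signal across purposes."""
    return max(debt_consolidation, home_improvement, emergency)

# Hypothetical prospects with (debt, improvement, emergency) scores.
prospects = {
    "A": (0.72, 0.10, 0.05),
    "B": (0.20, 0.64, 0.15),
    "C": (0.08, 0.12, 0.09),
}
scores = {name: combined_heloc_score(*p) for name, p in prospects.items()}
# Rank prospects by the single combined score for targeting.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In practice the combination step would itself be a fitted model rather than a fixed rule, but the shape is the same: several purpose scores in, one actionable score out.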
Now, we've explained what a model is and why we need a model. Right? And now let's come to the how: how to build a model.
Every model needs data, right? The first and foremost thing is we need a business case for a model. Second, we need the data that supports the business case, because if you don't have the data, you won't be able to support the business case. For example, let's say you're building a HELOC model: if you don't have any data on mortgages, you won't be able to build the model. Right?
You need some data about the person's intent to do a mortgage or a HELOC. If you don't have those signals, you won't be able to build it out. So that's the second thing: you need data, after the business case, and then you also need a good machine learning process. Now, throwing out another jargon: what's machine learning?
Let's talk about that right now. So, we talked about data science, right? Now, let's talk about how all of these are connected, using a marketing example. A machine learning algorithm, a machine learning process, is going to take the data, sift through all the insights that it can find in your data set, and give you a score.
The score could be a probability, or the score could be a number like FICO, right? And that's what machine learning does. It just tries to find signals and gives you an insight you can act on.
Now, what data science does is ensure that whatever the model is saying is right or wrong. Is it sane? Is it giving you a compliant answer? Is it working?
Is it hitting your ROIs? That's what data science does, at the data science level. Now, what is AI?
AI is automated data science, where you don't need a person to sit there and do stuff. For example, let's take ChatGPT. How is ChatGPT built? With ChatGPT, you ask a question and it gives you an answer. Now, if after you get an answer you're not satisfied with it, you say, give me another answer.
It's going to rephrase it, and it's going to give you another answer. Right? So, it's going to take all the information that you give it, try to learn from it, and give you an answer. Right?
So, that's AI. So, from a marketing standpoint, as we build a model, you get campaign results. Those get fed back in: the AI system takes that incoming data, tries to understand how it could better the model that we already have, and puts the rebuilt model into production, so you don't need a data scientist in the loop. So, data science is automated by AI.
And that's exactly what Zack was saying in the morning keynote. It's about automating all the mundane work you have to do so that we can focus on much better work. Right?
So, that's that. Now, the final gist of this presentation is: for any successful model, we need a business case, we need supporting data, and we need a good model-building process.
Now moving forward, let's talk about data. Right? We at Deluxe have a wide range of data, different signals from different data assets. For example, if you want to do a HELOC campaign, we'll be able to marry the credit data with information about the consumer, and also marry it with trigger information.
Right? Now let's go to another use case where you want to find people who want to buy beds. Right? It's also a use case.
Now, what signals do you need? We have those signals with us. Right? We could find people who move, because when people move, they need beds.
Right? You could also find people using a life event: if, for example, they have a kid, they may need a bed, or even furniture, different pieces of information, right? So, the reason why data science is front and center at Deluxe right now is the data that we have at hand. The data gives us enough signal to learn something, to have actionable insights, and to support you in getting your ROIs, right?
Moving forward. Now, we've talked about data. Now, let's talk about a use case. This slide says it's model types, but I would like to call it defining a model. Right? Now, how do we define a model?
Initially, before you even start a model, we need to know what data goes into the model. Right? There are two kinds of data.
The data could be provided by the client. For example, a client could come back and say, hey, I want to find people that are like my current customers. I don't want to mimic others; I just want customers like mine. Then what we have to do is get information about your customers and build a model from it.
When that happens, I'm building a specific model for you. So, it's a custom client model, because it tries to mimic what your customers look like. Now, the second piece is third-party industry models, or what we call a Deluxe proprietary model. How we end up building that is we have signals that could say whether a person opened a home loan or not, because we have access to all public records data. So, we'll be able to track who opened what, with which lender, in what market, what the mortgage loan type was, different pieces, right?
So, we are able to create our own target universes using the data at hand. Now, when we do it using our data from the data lake, it becomes a proprietary model, a generic model. Now, what's the use of a generic model versus a custom client model? With a generic model, you are able to start off fast, because once you come in, you will be able to validate that the model works for you, and you could go in and do your marketing campaign in, I would say, maybe a week or so.
If everything validates, yes, in a week we'll be able to do it, right? And the second thing is that you're able to get a wide range of universe that you have never seen before. So, you're going to find new pockets of people that you have never tested on.
Right, which is going to widen your portfolio. Right? And all of those still go through your underwriting criteria, so you're safe from a risk standpoint.
Now, after we define what data goes in, we look at what kind of targeting you want. Right? Most of the time, you do individual targeting, because you want to use an individual credit asset, right? There are also places where you want to target a household; when you're coming from an invitation-to-apply standpoint, you would want to target a household.
Within the household, the person with the highest score or the highest FICO could respond to the product. There are also situations, for example, checking account acquisition for banks, where you want to tie a marketing campaign to branch distance. Right?
We want to make sure that each branch gets enough mail pieces so that they're able to target them and grow that business. So, the next kind of targeting is household plus branch. And then there is another thing that we do from a carrier route standpoint. I think you would all know about carrier routes, right?
With carrier route targeting, we target every house that's in a carrier route, and what the model is trying to do is find the best carrier routes that will be beneficial for you. And you tie those carrier routes to the branches nearby, so that those branches can convert that person.
And in the end, there is the product, right?
So, that's the business case coming to a full conclusion. You need to identify the product. That's the final question: what are you trying to solve for? Whether you want to maximize your checking account acquisitions, or maximize the number of people you can get a personal loan to.
Right? And the list that I've put here is very small. But this list could expand, depending on what the data lake supports and what kind of use cases you have. Right?
This could also include people who want to buy more furniture, or people who want to do home improvement loans, who want to build a pool, for example. Right? So, that's that. Moving forward to the next slide.
Now, let's talk about how a model is built.
It's, I would say, a very simple process at a very high level. When you go into the black box, it becomes more complex. I just wanted to show you what it looks like. So, we said that we need a business case and we need data. Right? Now, once you have a business case and you have the data, the first step is we extract the data that's needed for the model.
Okay. Come on. After we extract the data: we have a lot of signal attributes; let's say the data lake has six thousand to ten thousand odd attributes.
Those are nothing but signals. I call them signals because each attribute is going to give you some different information about a person, right? So, we have to select the best attributes that would solve for your use case, because some attributes will be useful for you, and some attributes will be useful for someone else. Right?
Now, after we do that, next come the machine learning models. We build multiple machine learning models using the data and attributes that we have. And what that's going to tell you is which attributes are connected to the outcome, what provides maximum lift, what provides good signal. And after you find that, you validate the model, because we need to ensure that it's doing its job.
Right? And after we validate the model, we deploy it. Simple. Right? Now, how have we done it in D3?
So, here you can see it's almost a one-to-one mapping of what we've done in D3. The initial step is we get the data and do some data transformation; then we do feature selection, which is the attribute selection piece; then we have an ML pipeline, which is part of the model development step, and model evaluation, which is also part of model development; and then we do model reporting. Now, the reason we do model reporting is that we need to know if the model is doing the right job, and also, from a compliance standpoint, the Fair Lending Act and regulations around it, we need to document all the attributes that the model uses.
Right? And after we do that, we let the folks know that the model is built.
Right? Now what is done inside the code is much more complex, right? But the overall process looks the same.
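As a rough illustration of the build steps just described (extract data, select attributes, train candidate models, validate, deploy), here is a toy sketch. Every function body is a stand-in assumption, not the real D3 pipeline; the field names and the hand-made linear scorer are invented:

```python
# Toy end-to-end model build: extract -> select attributes -> train ->
# validate -> deploy. Each step is a simplified stand-in.

def extract(rows):
    # Keep only rows that carry the fields the business case needs.
    return [r for r in rows if "fico" in r and "inquiries_6m" in r]

def select_attributes(rows):
    # Pretend feature selection kept the two most predictive signals.
    return ["fico", "inquiries_6m"]

def train(rows, attrs):
    # Stand-in for the ML step: a hand-made linear scorer over attrs.
    def score(r):
        return 0.001 * r["fico"] + 0.05 * r["inquiries_6m"]
    return score

def validate(model, holdout):
    # Minimal check: the model ranks a likely responder (first record)
    # above a likely non-responder (second record).
    return model(holdout[0]) > model(holdout[1])

data = [{"fico": 720, "inquiries_6m": 3}, {"fico": 640, "inquiries_6m": 0}]
attrs = select_attributes(extract(data))
model = train(data, attrs)
deployed = validate(model, data)  # deploy only if validation passes
```

The real pipeline does far more at each step (transformation, multiple candidate models, compliance reporting), but the control flow is the same shape.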
Now, I want to double-click on a specific piece here: the model report. So, after the model is built, how do we validate it?
How would we know that the model is doing a good job? Right? I know it's not that visible from where you're sitting, but I'll try to talk through it. Right?
So, when a model is built, we split the data into the data that is used to train on, and data that we keep separate, which the model has not seen at all. It's a validation set, right?
So, after the model is built, we run it on that unseen data, and we see how well the model is able to split your targets and your non-targets: people who convert, people who don't convert, right? The max lift is nothing but: you take a data set and you target your top 10%.
What's the lift that you get? Now, what is lift? Lift, in very simple terms, is the percentage of responders divided by the percentage of non-responders. Now, with this model, and this is an actual model in production right now, you're able to get almost four times your targets in your top 10%, compared to your control, compared to when you don't have a model at all. Say you have a hundred targets available in your universe; forty of them will be in your top 10%.
So, if you have a hundred targets, forty of them will be at the top, which is almost 4x your performance in your top decile. Now, the lift at the top three deciles is: you target 30% of the available universe, and you'll be able to get 2.5x of all the targets that you could ever get.
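The lift-at-depth arithmetic above can be sketched like this. The toy universe of 100 prospects with 10 targets is invented so that the top decile captures 40% of targets, matching the 4x example:

```python
# Lift at a mailing depth: share of targets captured in the top slice,
# divided by the share of the universe mailed. Random mailing gives 1.0.

def lift_at(scored, depth):
    """scored: list of (model_score, is_target); depth: fraction mailed."""
    ranked = sorted(scored, key=lambda x: x[0], reverse=True)
    n_mail = int(len(ranked) * depth)
    captured = sum(t for _, t in ranked[:n_mail])
    total = sum(t for _, t in ranked)
    return (captured / total) / depth

# Hypothetical universe: 100 prospects, 10 targets. The model puts
# 4 targets in the top-scoring 10 prospects.
universe = ([(0.9, 1)] * 4 + [(0.8, 0)] * 6 +
            [(0.5, 1)] * 6 + [(0.4, 0)] * 84)
lift_top_decile = lift_at(universe, 0.1)  # 40% of targets / 10% mailed = 4x
```

A control with no model would put targets uniformly through the ranking, so the same calculation would come out near 1.0.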
Now, moving on to the attributes. This is an example of the attributes the model found to have trends.
Right? For example, for this use case, we found that an inquiry done in the last six months in auto was predictive: the higher the number of inquiries, the higher the response, right? And the same goes for the number of HELOC transactions. If a person has done a lot of HELOCs in the past, he's more likely to take this product.
So, this is the level of information that we go into in the model report to explain what the model is doing, in order to be compliant and in order to be very transparent with our clients about how the model is doing.
Moving forward. I'll let Jane take this.
Alright. I'll give him a break.
So, this slide actually kind of shows what our typical marketing campaign looks like at Deluxe. And, obviously, the first thing we do, on the left-hand side, is really understand what the business need is. What is our client's business? What's going on in the marketing space overall?
And then we really try to determine, you know, what the strategy will be: is there a business case for data science to be part of this? And if there is, then we obviously involve our data science team. They're going to go along the top side, start looking at the data, start doing segmentation and analysis, building the models if appropriate. And at the same time, as a client strategist, I'm going to work with our marketing campaign managers, talk about any creative testing that we want to run, and put together the rest of that campaign strategy. And then, when the models are complete, we do a model evaluation, work together with data science to determine what our selections are going to be and what our cutoff should be, and then go into the actual campaign deployment.
And then, of course, making sure that we do reporting on the back end, getting all that attribution data back in so we can continue to make our campaigns smarter and our model smarter.
Yep. Just to add one more piece here. The creative strategy also involves data most of the time. I could give you an example where, for a client, we found that sending a creative in Spanish to a specific group around specific branches increased the response rate by 2x, not using a model, just by changing the creative. Right? Some of that comes out of attribution, right?
We do testing, we get that information, we have those demographics. We'll let them know that maybe you could try it, and then they would introduce a test cell. And once the test is successful, we push it forward to the next campaigns with the data.
Am I right?
Yep. Exactly.
Now that we've talked a little bit about what is data science, we wanted to walk you through a couple of examples. So, these are actual client campaigns, we'll keep them anonymous.
But we'll talk about how we incorporated data science, how we're deploying these campaigns today, and how they are performing.
So, the first one here is a mortgage lender who, obviously, with the past year of rising interest rates and growing home equity, people's home values going up and having more and more value in those homes that they could use, really wanted to start focusing on HELOCs.
And what we recommended to them was: let's do a custom prescreen model that we will build specifically for home equity.
And the reason we wanted to do that is they had a couple of, interesting requirements that we'll talk about a little bit as we went into this. But it allowed us to really identify the most responsive people to this campaign but also the most qualified people, which is going to be key for a credit offer.
And so, to ensure that we are getting to those most qualified people, the first thing we have to do when we're building a model is make sure that we're applying the same underwriting criteria, or credit criteria, when we develop the model that we do when we run an actual campaign.
And so, in this case, just to give you some examples of standard underwriting criteria: if my FICO cutoff was 680, we wanted to make sure we were going to drop everybody who had 680 or less as we were developing the model.
Do you have something to add?
Oh, one more thing to say. In addition to all of the exclusions we were doing, kind of global rejects, we call them, we were also applying some selection criteria.
As I mentioned, they had some specific requirements. They needed to make sure that the loan value had a minimum of fifty thousand for home equity.
But that varied depending on what your loan-to-value was, what your home type was, whether you were single-family or multi-family, whether it was a primary residence or a secondary or investment residence, and then, obviously, the FICO range varied as well. So, that's why we had to build this into a custom model for them.
Yes. And this is front and center of everything. We have to build models on the actual universe that the client wants to model. So, if you are going to build a custom client model, we need to understand their underwriting criteria and build a model only on that universe. Because once we move away from that universe, the signal is going to be completely different, and you don't want to target those people at all. So, this allows you to maximize, to find more people in this universe.
Moving forward. Okay. So, here come all the numbers, but the charts are probably much easier to follow. Right?
For this client, before we went on to build the model, we wanted to see what the top attributes are that would predict the conversion curve for their HELOC product. We wanted to know what the conversion looked like. Right?
So, we did an initial segmentation analysis. We call it profiling on our side, and it's automated, by the way. Once you give it the data, it will spit all of this information out to us, and it will rank order the attributes by the best. The ones you're seeing here are the best attributes by themselves.
For example, here, if you had just used this one attribute, which is nothing but the number of installment trades a person has, and you only targeted people with more than seven installment trades, that's predominantly debt consolidation we're seeing here: people who want to consolidate debts through a HELOC. Right?
And what do you end up getting? You would be able to get almost 50% of the targets just in this bucket, for, like, the top 50% of the universe. So, you're able to get almost an equal number of targets out just by targeting this group. So, technically, if your business does not want to do a model, you could take this attribute and apply this cut on top of your underwriting criteria.
We call it selection criteria on our end. You could just do this cut and get a performance boost immediately, because you're removing your non-responsive universe and maximizing your conversion universe. Now comes the next example.
The next example is the aggregate monthly payment that someone pays on their revolving trades. Intuitively, it means they want to consolidate their revolving trades: they have a lot of credit card debt, and they want to consolidate it. That's the actual intuition behind what this data point is saying. Now, if you want to use this data point, you would most probably want to only target people at $441 and above, people who are paying $441 and above in monthly payments, and you would be able to get 37% of your targets by just mailing 9% of the universe.
So, you're getting 3x lift by just looking at one segment. And this is all without a model, just by using one attribute. That's the signal the data provides. Now, think about it: if you're able to do this with just one attribute, think about how a model comes in, looks at different attributes, connects all the dots, and gives you a much stronger result, right?
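A single-attribute selection cut like the one just described might look like this in code. The field name, threshold variable, and sample records are hypothetical:

```python
# Sketch: apply one attribute cut on top of the underwriting universe.
# Keep only people paying $441+ per month on revolving trades.

THRESHOLD = 441  # dollars per month, from the segmentation analysis

def selection_cut(universe):
    # universe: records that already passed underwriting criteria.
    return [p for p in universe if p["monthly_revolving_pay"] >= THRESHOLD]

universe = [
    {"id": 1, "monthly_revolving_pay": 600},
    {"id": 2, "monthly_revolving_pay": 120},
    {"id": 3, "monthly_revolving_pay": 441},
]
mailable = selection_cut(universe)  # ids 1 and 3 survive the cut
```

A full model generalizes this idea: instead of one threshold on one attribute, it weighs many attributes at once and ranks the whole universe.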
That's what we did next. We went and built a HELOC model. Now, what you're seeing here is what we call training performance, which is nothing but: you have the data, and you train the model on the data that's here, the training data.
And this is the performance that the model had on the training data. It's like a book. You're given a book, and you read that book.
Now, if I ask questions about that book, you'll be able to answer them easily. But if I give you questions from another book, you're going to find it hard, unless you've learned it really well. Right? It's similar here.
So, what you're seeing here are questions the model has already learned, and it's able to push 51% of the targets into the top 10%. So, you mail 10% of your available universe.
You'd be able to get 51% of your targets right there, which is nothing but 5x lift for your marketing campaign, which is phenomenal, right? Now, moving forward, this is on the holdout sample. What is a holdout? It's nothing but a brand-new book, which you have never studied, but which covers similar concepts to the one you have.
So, the model is able to work even on something that it has never seen before, which is what is going to happen month after month, because it's going to see data it has never seen before. And it's still able to perform consistently at almost the same percentage, capturing almost 49% of targets, which is nothing but 5x lift at the top.
So, that means the model is stable. Now, these are HELOC targets. We also wanted to check something else: there's a chance this could also be a mortgage conversion. Assume that a person did not convert for a HELOC. Is there a chance they would take a mortgage instead? So, we tried to use the same model to check whether, if there is a fit, they could go for a mortgage product.
And, hey, we found that we get 2x to 2.5x response conversion for a mortgage product among the people who did not take a HELOC.
So, if you have a loan officer there, they could still convert them to a mortgage offer. It's a double whammy in general, right? So, we deployed this model into production. Do you want to go over the strategy?
Yep. So, just as we discussed, we had to make sure that we applied that underwriting criteria, so that was the first step. This was an acquisition campaign, so we applied suppressions to make sure we were suppressing out any current customers, and NCOA to make sure we were mailing the most current address. And then, once we have that remaining universe, we score it with the model and decide how deep we want to mail, or what the desired mail quantity is, based on a budget. We might say we're going to take the top four hundred thousand scores and just do that easy cutoff on the score. Or we might use a model decile: we take those scores, cut them into ten even deciles, and say we're going to grab all of decile one and all of decile two.
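The decile selection just described can be sketched as follows. The scores are made up, and the convention that decile 1 holds the highest scores is an assumption for illustration:

```python
# Sketch: cut a scored universe into ten even deciles and mail the top two.

def to_deciles(scores):
    """Return (score, decile) pairs, decile 1 = highest-scoring tenth."""
    ranked = sorted(scores, reverse=True)
    n = len(ranked)
    return [(s, i * 10 // n + 1) for i, s in enumerate(ranked)]

scores = [0.91, 0.85, 0.40, 0.77, 0.12, 0.66, 0.58, 0.33, 0.25, 0.05]
deciled = to_deciles(scores)
mail = [s for s, d in deciled if d <= 2]  # grab all of deciles one and two
```

The alternative Jane mentions, a fixed-quantity cutoff like "top four hundred thousand scores", is just `sorted(scores, reverse=True)[:400_000]` on the same ranking.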
And then, in terms of performance, this is actually a campaign that just mailed in January. So, we don't have final results yet. But the really cool thing was, working with this client, we had credit models already built into a daily credit program that we run for them, and because they saw the success of that, they were very comfortable with us moving forward with modeling in this batch campaign. And the other thing is, all the analysis that Sushaanth and the data science team did actually shows us exactly what to expect. So, we could go into this quite confident that we're going to get the 2, 3, 4x response rate by just mailing those top deciles.
Alright. Our next example is a checking acquisition example.
In this case, we had a bank that came to us. They wanted to acquire additional checking accounts, but they wanted to do it at the lowest possible CPA.
And so, the approach we recommended for this one was our carrier route model. What that allows us to do is identify not individuals or households to mail, but the carrier routes that are going to be most responsive, and that allows us to mail that campaign at a carrier route level, which is the lowest possible postal rate.
Now, looking at the analysis. The carrier route model is also built with the same methodology. It's just that the data is aggregated at the carrier route level, so you have information about each carrier route, what kind of household representation is available in the carrier route, and things like that.
Right? So now, when we go through this model, we were able to figure out, so, as a part after the model gets after we get the model scores, basically try to validate almost the same kind of analysis here. I don't have a graph to show you, but if you look at here, if you had targeted, this is the model validation that we found out, right, if we post this is post campaign. So, after the campaign was done, we were able to find almost like 5, what is it... 42 basis points, the 400 basis points in terms of response that we were able to get on account response rates, right?
And it's almost similar to what we were able to see in the in the mail in the this is this this output that you're seeing here is on the training data, and we were able to say that the optimal mail quality for you to get the, the ninety nine person response accuracy, that's almost like one, almost 0.1% in response, right, is by cutting at what was that? Six eighty as your, as your top, as your male quantity. And you are able to see that at almost six eighty, when you when you add all of them together, we would be almost roughly at a half sent, right, there.
So, this basically drives the analysis of what we did, and we were able to prove that out in the deployment as well. These outputs are coming from an actual campaign. Right?
Moving forward to the next slide.
So, on this deployment, we did the carrier route model, we scored it, and we selected the carrier routes we were going to mail. But we had an extra wrinkle: this client has specific LMI and majority-minority requirements that they wanted by region.
And so, they asked us to make sure that we were hitting those minimums by region. So, we run it like we normally do: we score it, we pick who we're going to mail, and then we double-check those quantities by region. And if we see that we're not quite hitting the level that we need to, then we have to go deeper into the model in that particular region.
We pull some additional records up, or additional carrier routes, until we get to the right balance. So, it's a combination of carrier route modeling and some additional forcing where needed.
So, this one is almost like a selection problem. Right? So, technically what we have done is, on top of the modeling, we also have a supply chain optimization problem, you could say. We try to optimize the mail quantities such that we are able to assign enough deltas, right, so that we hit our target limit on the dot.
And we need to give it a bit more, because sometimes, when it goes through the final NCOA and all those processes, things could drop, and then the mail volumes could fall below the minimums and we'd have to do it again. So, we tried to be at least one percent above the numbers that were provided.
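A minimal sketch of the regional-minimum backfill described above: take everyone above the base score cutoff, then for any region that falls short of its minimum, go deeper into the model for that region, with a roughly one-percent buffer to absorb NCOA and suppression drop-off. Field names and the buffer value are illustrative assumptions, not the actual Deluxe implementation.

```python
def select_with_minimums(routes, base_cutoff, region_minimums, buffer=0.01):
    """routes: list of dicts with 'region', 'score', 'quantity'."""
    # Start with everything above the base model-score cutoff.
    selected = [r for r in routes if r["score"] >= base_cutoff]
    for region, minimum in region_minimums.items():
        target = minimum * (1 + buffer)  # pad for NCOA/suppression losses
        total = sum(r["quantity"] for r in selected if r["region"] == region)
        # Go deeper in the model for this region, best scores first,
        # until the padded minimum is met.
        extras = sorted(
            (r for r in routes if r["region"] == region and r not in selected),
            key=lambda r: r["score"],
            reverse=True,
        )
        for r in extras:
            if total >= target:
                break
            selected.append(r)
            total += r["quantity"]
    return selected
```

In other words: a normal score cut first, then per-region forcing only where the requirement isn't already met.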
And here, I'm just going to add a little plug for another solution that we have called Postal Select. That wasn't really part of this campaign, but it's the ability to take a carrier route model, where we're optimizing postal cost against response, and also do household modeling on the same universe. What we can do is grab as many records as we can that are going to hit the response level we want at the carrier route level, and then backfill with some additional high-ranking household-model folks. That way, we can optimize across both of those models and keep the postal costs as low as we can. In this example on the slide there is a client that we moved over from carrier route to this hybrid approach, and we were able to bring their CPA down by fourteen dollars, and that has actually continued to drop even past this twelve-month period.
So, just to add to that, right? The household model basically targets households, which means it's a brand-new model, a different model, and the carrier route model is a different model. Now, both of these models, right, are not built every month.
Right? These are basically built once, and we continuously monitor them. As long as the performance is good or improving, we are fine with that. If it goes down, we immediately interject and rebuild, or I would say retune, the models and push them back in, so that we continuously maintain the same ROI or a better ROI than we signed up for.
Alright. Our third example here is an unsecured loan acquisition.
And this was actually a mortgage lender who, obviously, with the high interest rates that we're seeing this year and the growth that is happening in the personal loan space, wanted to refocus their efforts on that part of the market.
Our recommendation there was not just to do a single model, but to do three different models, looking at response, conversion, and denials, and bringing them all together in what we call an ensemble approach.
And, in this case, we definitely needed the client's first-party data. They needed to give us who they are receiving in terms of applications, who's getting approved, and who's getting denied, so that we could build these individual models.
Now, maybe I'll give you the reason why we had to do the response, conversion, and denial models in the first place, right? Basically, you're trying to maximize the people that go into the funnel. In order to maximize the people that go into the funnel, you need the response model. You need to find the people that are highly responsive, right?
The flip side is that people who are highly responsive do not necessarily convert. Right? It's very hard for you to find people that are responsive and are ready to convert, right?
And even after you find that, there are situations where those could be risky pockets, we have found, where they would basically default after they took a loan. So, that's the first-payment-default data that you may have to use, depending on the metric that the client uses. We need to take that into account as well, so that we find people that will convert and that are not risky, right?
And in order for us to maximize this whole funnel, we had to build three models, three levels of models, that go onto the prescreen audience. We apply all three models to the same file and try to find the people that will convert, that won't default down the line, and that are also likely to respond.
Now I'm going to go into each model, because each of these is one single model. Right? For response, we have one model; for denial, we have another model; and for conversion, we have another model. Like I said, a model can do only one thing properly.
So, this model is going to only do response, so that it learns a lot about that alone. Now, these are the final model scores that we found. And as you see here, like FICO, the higher the model score, the better; lower ones are bad, right?
As you see here, as the model scores get high, that's the top 10%, these are all broken down into 10% buckets, you see the number of people that respond is high.
So, the model is able to do a really good job of pushing all the people that would respond to your offer to the top. And when you look here, you could go all the way down to the second bucket, which is nothing but 20% of your universe, right, and still get a lift of almost 2.5, almost 2.5 times over if you had mailed randomly, right, which is perfect. Now, moving forward, the next piece is the conversion model.
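The decile lift numbers he's reading off can be computed like this: rank prospects by model score, cut them into ten equal buckets, and divide each bucket's response rate by the overall (random-mail) rate. The data below is synthetic, just to show the mechanics.

```python
def decile_lift(scores, responses, n_bins=10):
    """Lift per score decile: bucket response rate / overall response rate."""
    # Rank prospects from highest to lowest model score.
    ranked = sorted(zip(scores, responses), key=lambda x: x[0], reverse=True)
    size = len(ranked) // n_bins
    overall = sum(responses) / len(responses)  # the "mail at random" baseline
    lifts = []
    for i in range(n_bins):
        bucket = ranked[i * size:(i + 1) * size]
        rate = sum(resp for _, resp in bucket) / len(bucket)
        lifts.append(rate / overall)
    return lifts
```

A lift of 2.5 in the top two deciles means those prospects respond at 2.5 times the rate of a random mailing, which is the comparison being made on the slide.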
Now, as you see, when you enter the conversion model, the lift started to drop. The reason is the signal for conversion also has an extra layer of preapprovals.
Because a person who will respond may come into the funnel and be declined during the underwriting criteria, when they would otherwise have locked and then moved into funding. So, we have one extra layer that the model has to learn on top of the conversion. Right? So, as you see here, from a conversion standpoint, the higher the score, the better: when you combine the top two n-tiles, the top 20%, you would get a lift of 1.5 times over what you'd mail at random.
Next comes the denial model. Right? Now, for the denial model, the lower the scores, the better, right? So, what we try to do is remove the top. Okay, I think it's the same idea: since this is built on denials, the higher the score, the more likely they are to be denied. That's what it is.
So, we would need to basically remove them. What we did here was we removed the top three buckets, because those were the people where a lot of denials happened. In terms of lift, when you compare both of them, it averages around two-ish. So, we removed all the people that fell in our top three n-tile buckets.
So that we reduce the risk. Still, there was risk, you see, coming from the bottom n-tiles. So, if you want to go lower on risk, we could go further down, right, but your universe also starts to shrink.
Now, we have three models with us, and somehow we need to join all of them together. Right? So, what we ended up doing was we weighted the response model and the conversion model, and we removed the people that fell in the top three n-tiles of the denial model. Rather than weighting the denial model as well, we wanted to just eliminate them, because the client was averse to risk. So, we removed those, and in the end we have an ensemble approach where all the information from three models comes together and drives the final output, which is the people that we want to mail.
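A rough sketch of the ensemble as described: weight the response and conversion scores together, and hard-exclude anyone in the top denial n-tiles rather than weighting the denial model. The equal weights and the three-decile cut here are illustrative assumptions; the talk doesn't give the actual weights.

```python
def ensemble_score(records, w_response=0.5, w_conversion=0.5, denial_cut_decile=3):
    """records: dicts with 'response', 'conversion', 'denial_decile'
    (denial_decile 1 = highest denial risk)."""
    # Hard exclusion: drop anyone in the top denial deciles entirely,
    # instead of folding the denial model into the weighted score.
    kept = [r for r in records if r["denial_decile"] > denial_cut_decile]
    # Weighted blend of the two remaining models (adds a 'final' key in place).
    for r in kept:
        r["final"] = w_response * r["response"] + w_conversion * r["conversion"]
    # Highest blended score first: this is the mail-ranking order.
    return sorted(kept, key=lambda r: r["final"], reverse=True)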
Want to take the strategy?
Yep. So, in deployment, we had to have the same underwriting criteria that we talked about on that other model applied first, to make sure that we're dropping out anybody who wouldn't qualify on the back end. We applied suppressions and NCOA.
And then we did the scoring of the ensemble model: adding together the conversion and the response, and dropping off those top two deciles of the denial.
And then, again, we have that remaining universe that we can select from, up to the desired mail quantity.
And in this case, performance: this is the initial campaign that went out. We had three different creative tests going on at the same time. So, as Sushaanth was saying, you know, we try to incorporate those as well and make sure that we can read those on the back end.
And I won't go through this line by line, but, you know, the client was very pleased with what we saw in terms of response and, they are they are continuing to mail this ongoing.
So, this was, I would say, a before-peak campaign. The more funds we get, the more money we make in general, and the same for the client, right? But what they wanted was to maximize the loan balance. So, they want something that is greater than ten thousand dollars no matter what. And we are trying to maximize this as much as we can right now, because we are also trying to build another model that is tuned towards finding people that will open with a higher loan balance. That's something that we are working on. So, though we have the models built and the campaign is stable, we still continue to try to enhance the ROI that we're able to give to the client on an ongoing basis.
I think we need to shift to the end. We had one more example, but I don't think we have time to cover it.
Okay. Maybe I'll just go through. This is a retention piece.
Yeah.
In general.
Another model that was used for mortgage retention.
So, maybe you could just talk about just the strategy that we used?
Yep. In this case, we built the model similar to what we described on some of the others. We were doing both mortgage and HELOC in this case. So, we're selecting the top deciles and sending out a mortgage offer or a HELOC offer, depending on where you scored.
And again, we don't do reporting on this campaign within Deluxe. The client does their own reporting, but they have continued to mail this for the past year and have been very pleased with the results.
So, thank you for joining us. I think he just flipped to the next one.
These are some of the remaining events that you might be interested in, of course, the party tonight, but there is a reception down in the Colab from four to five.
And if you have any questions, we'll be up here. Yes. Thank you.
Maybe you can take a couple of questions if you have.
Yeah. I guess we got two minutes.
Yeah, if you have a question. So, when you build a model, right, and I saw those dates as of 2023 when you had engaged, do you rely on some sort of, you know, feedback? Yes.
So, the feed... so, we continuously monitor the models, like I said, right? We create targets ourselves. So, we continuously look at the performance, and when the performance goes down, it immediately lets us know.
Right now, we have not finished the complete automation; the AI piece is not fully done. But what happens is we get an intimation. So, we go in, we rebuild the model or tune the model back again and push it in, and again, it's the same thing.
We wait. It's not real time. It takes two days for us to turn around.
Yeah.
Yes. Yep. Yes.
Thank you. If we're getting the data, from the client.
If you're getting the data from the client. So, it depends on the client.
Yes. Any other questions?
So, we had two approaches. One was a heuristic approach where we basically put those scores together and tried to do an analysis on how much lift each of those provides. So, that was one way: we manually took a cut using statistical analysis. Another one is we tried to build a decision tree, a small, low-level decision tree, to see how the cuts were assigned.
And then, based on the Gini values that were showing at the bottom, we were trying to create a multiplier so that we could apply that multiplier on top. So, we used two approaches and tried to marry both of them together to arrive at it. But it was not automated. It was very specific to the client, so we had to... that's why custom models are different.
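One possible reading of the Gini-based multiplier idea, reconstructed as a sketch (this is hypothetical, not Deluxe's actual code): for each component model's score, measure the Gini impurity reduction of its best single split against the outcome, then use those gains as relative weights for the ensemble.

```python
def gini(labels):
    """Gini impurity of a binary label list."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split_gain(scores, labels):
    """Largest Gini impurity reduction from a single split on this score."""
    parent = gini(labels)
    best = 0.0
    for cut in set(scores):
        left = [lab for s, lab in zip(scores, labels) if s <= cut]
        right = [lab for s, lab in zip(scores, labels) if s > cut]
        if not left or not right:
            continue  # degenerate split, skip
        child = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        best = max(best, parent - child)
    return best
```

A score whose best split separates outcomes cleanly gets a large gain and therefore a large multiplier; normalizing the gains across the component models would give the ensemble weights.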
So, it was specifically done for the client. We had a person who actually built that ensemble approach. Done.
Anything else? Alright. Thank you.
Thank you.
Hundreds of millions of terabytes of data are created every day, from Google searches and social scrolling to shopping and streaming. When it comes to extracting value from this data so you can make better customer connections and optimize your campaign outcomes, is the juice worth the squeeze? The answer is a resounding yes, with the right data partner. In this session, we’ll reveal how Deluxe’s data science team uses predictive modeling, machine learning and AI to analyze raw customer data and transform it into high-performing campaigns. You’ll see how data drives deeper audience engagement through personalized content, delivered at the perfect time.