Jan 12

What to consider before starting an AI project in your company?

Artificial intelligence (AI) can have a significant positive impact on your business operations, but only if done right. Every successful AI project is brought to life through carefully thought-out and executed stages. The effort you put into defining your use case, sourcing data, and building and fine-tuning your machine learning model will ultimately determine its performance. So before you kick off the next AI project in your company, here is what you need to consider.

What is your use case?

In other words, where do you begin? The first step in AI development is defining the use case. A strong start sets the tone for what you are trying to achieve and aligns your team’s efforts. You must have a clear goal:

What are your business’s pain points?
What are you trying to improve with AI?
How will AI help solve this problem?
What is the aim of this project?
What resources will be allocated for this project?
What are the success criteria for this project?

If you can answer the questions above with confidence and clarity, it will help you scope the project and get on the right track from the get-go.

Remember to use an iterative approach by breaking down your work into smaller steps. For example, to build a customer service agent, start by creating a solution that forwards customer contact information to the right person before trying to have the agent solve the problems independently.

What training data do you need?

Data sourcing is by far the most crucial stage in AI development. How well your machine learning algorithm performs depends on the training data and the quality of its annotation. You want to avoid the unwelcome ‘garbage in, garbage out’ scenario.

Start by defining your process for getting the necessary data. Continue with the iterative approach and divide this process into smaller steps:

Decide on what your datasets should look like in terms of type and quantity. For example, is it speech data you require? Do you need thousands or millions of voice recordings to achieve the desired result? Start small, see how it goes, learn from it and adjust your data requirements.

Once you have determined your data needs, find a data partner. Make sure they can provide data that matches your use case, and agree on specifics such as data project management and how and when the data will be delivered before you negotiate and finalize the deal.

How will you build and train your AI model?

As with data sourcing, AI model building and training should be a step-by-step process. Consider aspects such as:

What technology will you use?
Which algorithm will you be working with?
Which platform will you choose?
Will you build the model in-house or work with a partner?
What is the training process going to look like?

If you’ve got the above figured out, define the features of your model. Unnecessary features can negatively affect accuracy, so you should only use components relevant to the model and your use case. Use tools and algorithms to help measure and remove unnecessary model features when needed.
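One simple way to measure whether a feature is pulling its weight is to check how strongly it correlates with the value you are trying to predict. The sketch below is only an illustration under stated assumptions: the feature names, the toy data, and the 0.1 cut-off are hypothetical, and a correlation check only catches linear relationships, so treat it as a first screening step rather than a complete feature selection method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def screen_features(features, target, min_abs_corr=0.1):
    """Split feature names into (kept, dropped) by |correlation| with the target."""
    kept, dropped = [], []
    for name, values in features.items():
        if abs(pearson(values, target)) >= min_abs_corr:
            kept.append(name)
        else:
            dropped.append(name)
    return kept, dropped


# Hypothetical toy data: one informative feature and one constant column.
features = {
    "call_duration": [1.0, 2.0, 3.0, 4.0, 5.0],
    "constant_flag": [1.0, 1.0, 1.0, 1.0, 1.0],
}
target = [2.0, 4.1, 5.9, 8.2, 9.8]

kept, dropped = screen_features(features, target)
print("keep:", kept)
print("drop:", dropped)
```

Dropped features are candidates for removal, not a verdict. Retrain without them and compare the results before making the change permanent.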


To make results easier to measure, version each model iteration and compare the iterations using the same data. Also, when choosing the algorithm, consider whether it will let you interpret the output without jumping through hoops.
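Comparing versions on the same data can be as simple as scoring every version against one fixed evaluation set. The sketch below assumes each model version is available as a plain function. The two rule-based "models", the version labels, and the four example messages are all hypothetical stand-ins for your own models and data.

```python
def accuracy(predict, examples):
    """Share of (input, expected) examples the model predicts correctly."""
    correct = sum(1 for inputs, expected in examples if predict(inputs) == expected)
    return correct / len(examples)


def compare_versions(versions, examples):
    """Score every model version on the same examples: {version: accuracy}."""
    return {name: accuracy(predict, examples) for name, predict in versions.items()}


# Hypothetical fixed evaluation set, shared by every version.
EVAL_SET = [
    ("refund please", "billing"),
    ("app crashes", "support"),
    ("invoice missing", "billing"),
    ("cannot log in", "support"),
]


def model_v1(text):
    # First iteration: only knows about refunds.
    return "billing" if "refund" in text else "support"


def model_v2(text):
    # Second iteration: also recognizes invoices.
    return "billing" if ("refund" in text or "invoice" in text) else "support"


scores = compare_versions({"v1": model_v1, "v2": model_v2}, EVAL_SET)
for version, score in scores.items():
    print(f"{version}: {score:.0%}")
```

Because both versions see exactly the same examples, any difference in score comes from the model change and not from a change in the data.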

“AI model accuracy is closely connected to the quality of training data. Be specific and clearly define the steps you will take to collect data. A well-thought-out process for data sourcing can shorten your project by half a year.” - Thomas Forss, Co-founder and CEO, StageZero Technologies.

How will you test your AI model?

During the testing phase of AI development, you want to have some gold standard data to test against. Gold standard data is your perfect, correct dataset. Test each version of your model against this validation data to see how you are progressing, and keep iterating until the model meets your success criteria.

Edge cases, in particular, are where AI models struggle. Edge cases are rare events for which the training dataset contains no data, such as a bird covering the license plate just as a truck drives past the gate. You will need to add more data to resolve such cases, and they are hard to imagine before the product is live, which is why iterative development is recommended.

In speech recognition, one of the most common reasons models underperform is that the training data does not include voices with different accents. If a model is trained only on native British English speakers, it may not work when someone speaks English with a Spanish accent.
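A gap like this is easy to miss if you only look at one overall accuracy number. A minimal sketch, assuming your test results are tagged with the speaker's accent, is to break accuracy down per group. The accent labels and results below are hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(results):
    """results: iterable of (group, is_correct) pairs -> {group: accuracy}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, is_correct in results:
        totals[group] += 1
        if is_correct:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


# Hypothetical test results: (accent of the speaker, was the transcript correct?)
results = [
    ("british", True), ("british", True), ("british", True), ("british", False),
    ("spanish", True), ("spanish", False), ("spanish", False), ("spanish", False),
]

overall = sum(ok for _, ok in results) / len(results)
per_accent = accuracy_by_group(results)

print(f"overall: {overall:.0%}")
for accent, score in per_accent.items():
    print(f"{accent}: {score:.0%}")
```

In this made-up example the overall figure hides the fact that one accent performs far worse than the other, which is the signal that more voice data for that accent is needed.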

How much should you test? It will depend on your use case. For a chatbot, 70% accuracy may be enough, but for a self-driving car you want to aim for 100%, for obvious reasons. The main goal of testing is to prepare the model for deployment.

Are you ready for deployment?

As you prepare your AI model for deployment, consider the most efficient way to do so. Some projects might face regulatory issues. For example, because medical records in Finland cannot be processed elsewhere, a healthcare project may have to accommodate this.

In most cases, you can use a cloud provider to deploy your AI model. But if the project is large-scale, it might make sense to build your own server infrastructure or use a bare cloud solution and build your setup on top. There are also cases where you might deploy only locally: perhaps the model contains particularly sensitive company information or has no reason to be connected to the internet.

Evaluate whether you have achieved your KPIs and the goal of the AI project. If the set parameters are not met, adjust or replace the model, or improve the quality or quantity of the training data. Once all defined parameters are met, deploy the model into the intended setup.
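This check can be written down as an explicit gate that runs before every deployment. The sketch below is an assumption about how such a gate might look: the KPI names and the latency target are hypothetical, and the 70% accuracy target is borrowed from the chatbot example earlier in this article.

```python
def unmet_kpis(measured, targets):
    """Return the KPIs that miss their target as {name: (measured, limit)}.

    targets maps a KPI name to ("min", limit) when higher is better,
    or ("max", limit) when lower is better.
    """
    failures = {}
    for name, (kind, limit) in targets.items():
        value = measured[name]
        ok = value >= limit if kind == "min" else value <= limit
        if not ok:
            failures[name] = (value, limit)
    return failures


# Hypothetical targets and measurements for a chatbot project.
targets = {"accuracy": ("min", 0.70), "latency_ms": ("max", 500)}
measured = {"accuracy": 0.74, "latency_ms": 620}

failures = unmet_kpis(measured, targets)
ready_to_deploy = not failures

if ready_to_deploy:
    print("All KPIs met: deploy.")
else:
    for name, (value, limit) in failures.items():
        print(f"KPI not met: {name} = {value} (target {limit})")
```

Listing the unmet KPIs, instead of returning a single yes or no, points you toward what to fix: the model itself or the training data.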

Monitor and adjust for optimal results

Set up a monitoring system to ensure your model is working as intended. Continuous model iteration is needed to respond to technology, business, or data changes. Regularly test output against the gold standard data and update the model with new data to ensure it still fits the use case.

Expect that the model will behave differently when deployed in the real world. Pay close attention to irregular decisions or deviations from the pre-defined accuracy of the model. When the model's failure rate rises above your set threshold, or the model does not adhere to the set parameters, make the necessary adjustments and fine-tune for optimal results using new data.
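One way to watch for this is to track the failure rate over the most recent predictions and raise a flag when it crosses your threshold. The sketch below is a minimal illustration, not a full monitoring system. The class name, the window size, and the 30% threshold are hypothetical, and it assumes you have some way of telling whether each prediction was correct, for example through spot checks against gold standard data.

```python
from collections import deque


class FailureRateMonitor:
    """Tracks the failure rate over the most recent predictions."""

    def __init__(self, window_size, max_failure_rate):
        # deque with maxlen drops the oldest outcome automatically.
        self.outcomes = deque(maxlen=window_size)
        self.max_failure_rate = max_failure_rate

    def record(self, is_correct):
        self.outcomes.append(bool(is_correct))

    def failure_rate(self):
        if not self.outcomes:
            return 0.0
        return self.outcomes.count(False) / len(self.outcomes)

    def needs_attention(self):
        # Only raise a flag once the window is full, to avoid noisy early alerts.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.failure_rate() > self.max_failure_rate


# Hypothetical run: 2 failures in the last 5 predictions, threshold 30%.
monitor = FailureRateMonitor(window_size=5, max_failure_rate=0.3)
for outcome in [True, True, False, True, False]:
    monitor.record(outcome)

print(f"failure rate: {monitor.failure_rate():.0%}")
print("needs attention:", monitor.needs_attention())
```

A rolling window reacts to recent changes in the data, while a lifetime average would let months of good results mask a model that has recently started to drift.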

What’s next?

If you have decided on the use case, StageZero Technologies can help you with the next step: data sourcing. We offer an entire library of pre-collected and custom speech datasets.

If you are still trying to figure out what you need, reach out to us, and we will help you assess which datasets would work best for your project.
