Using Sqlite as a store for a long running process

March 16, 2023

Recently, I’ve tried using Open AI to get some data, and it worked quite well; however, one of the issues was the time that it took. For a process like this, you may think (as I did) that the simplest approach was to store everything in memory. However, with the process running, stopping, crashing, etc, I quickly realised that I was going to need to cache the data.

I decided to use Sqlite for this. I’ve written about Sqlite before. It tends to work really well in situations where you don’t really need a database, but a persistent storage (i.e. a glorified file system); for example, I’ve used it with Hangfire (well, by proxy).

In this post, I’ll cover adding Sqlite to a console application; we’ll use EF as the ORM. This follows on from this previous post where I demonstrate how to add DI to a console application.

Packages

If you’re using Entity Framework Core, then you’ll probably only need these packages:

Install-Package Microsoft.EntityFrameworkCore.SQLite
Install-Package Microsoft.EntityFrameworkCore.Tools

You actually don’t need the Tools unless you want to do things like add migrations / update the DB, etc…

DbContext

The next step is to add a DbContext; here’s a sample one:

public class MyDbContext : DbContext
{
        public DbSet<MyData> Data { get; set; }
        public DbSet<MyData2> Data2 { get; set; }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
                optionsBuilder.UseSqlite("Data Source=database.db");
        }

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
                modelBuilder.Entity<Data>()
                        .HasKey(b => b.Id);
        }
}

There’s a couple of things to unpick here; firstly, UseSqlite("Data Source=database.db") will create a file called database.db in your output directory, and that is the database. The other thing that we should probably point out is that the code inside OnModelCreating is redundant, as anything with an Id field is considered a key by EF by default.

To add the DB context, you need to call AddDbContext in the startup; from the previous post, that may look like this:

using IHost host = Host.CreateDefaultBuilder(args)
    .ConfigureServices(services =>
    {
        services.AddSingleton<MyClass>();
        services.AddDbContext<ExtractDbContext>();
    })
    .Build();

Accessing the DbContext

Finally, in the class, you can simply inject the DbContext:

public class MyClass
{    
    private readonly MyDbContext _dbContext;

    public AnalyseData(MyDbContext dbContext)
    {

That’s it, you can now read and write from that Sqlite DB inside your console app.

“What about the Long Running App bit that you lured me in with?”

Okay, so, obviously this depends on what you’re doing, but let’s imagine that you’re trying to write a file with hundreds of thousands (or even millions) of rows; one option is you can treat the Sqlite DB as though it were that file - so create a table that just has two columns: Id and Output and then simply write everything to that table. It’s worth bearing in mind that the Id column in this case would be purely for keeping the data in the correct order.

My preference here, though, would be to re-think the process: store all of the source data in a table, then run a second process to produce the output data to a second table, then a third to write the output. That way, you can stop the process at any stage, and simply resume from where you left off.



Profile picture

A blog about one man's journey through code… and some pictures of the Peak District
Twitter

© Paul Michaels 2024