Sunday, May 24, 2020

Your bluetooth is bad (continued)

More example of how to make bad Bluetooth devices

What format is my number?

Bluetooth developers continue to create a bewildering variety of undocumented data formats. Indeed, it's a race between them and the bizarrely incompetent "Distributed Ledger" teams for the worse numerical formats.

  • Nordic Thingy -- you can't just describe a value as a "uint_16"! Specifically, you can't do it because I don't know if you mean a big-endian or little-endian value (as it turns out, the Nordic Thingy is little-endian)
  • Also Nordic Thingy -- describing a temperature as a signed value and an unsigned fraction only works if you tell me the denominator of the fraction. Some devices might reasonably make the "fraction" part be out-of-10 to get a degrees in a tenth of a degree (which is just fine for a weather station), or it might be out-of-100 or even out-of-256. 


Your Bluetooth is Bad, continued

Very silly Bluetooth protocols, continued

Bluetooth devices continue to astound me: there are clearly capable programmers who manage to completely misunderstand what they're doing. In this episode: the Elegoo Mini-Car kit.

The actual kit is pretty decent: it's got a lot of parts and polish for something so inexpensive (I think I paid about $30 for mine). Up on the Elegoo downloads site there's a 400+ megabyte (!) download with a bunch of well-written instructions and even a Windows 7 driver program.

Clearly the Elegoo people are doing well in terms of creating a complete package.

But then there's the Bluetooth. They obviously switched at some point from a purely serial connection to a BLE connection; this always causes weird issues. 

How should you interpret a command description like this: {RGB[R][G][B][N][T][M]} ?

A smart person would start to figure that it's a weird combination of text and hex, and would kind of wonder how the command RGB is separate from the obviously hex parameters. But no, that's not it. You're supposed to literally type in the "{" and the square brackets. The parameters are in text mode, and they really do have all of the [ and ]. A complete command looks like this:

{RGB[255][0][0][2][0][1]}

A smart person will also realize that because of the insane overhead of the command, it no longer fits into a 20-byte send; you have to split it into two sends!

Let's analyze. Each parameter when sent in binary would be one byte; in fact they will be between 3 and 5 bytes. The command code, instead of being a nice simple byte (there are only about 12 commands actually supported) is instead up to 5 bytes (the MOVES command).

Wait, it's worse. What happens with {BEEP[0]}? Answer: the beeper should stop beeping. What about {BEEP[00]} -- a 00 should be parsed as a zero, right? Wrong; zero is a special case. Only a 0 counts as a zero; 00 is not a zero. This is unlike some of the other commands.

The programmer has managed to make a language that's hard to parse, inconsistent, and wildly inefficient. 


Sunday, May 10, 2020

Entity Framework (EF), SQLite and UWP

What I learned by using Entity Framework (EF) in a UWP application

I'm writing a super nifty e-book reader program (because, why not?). Along the way I discovered
that Entity Framework (EF), the easy-to-use object-relation manager layer, is in fact a poor match to UWP.

How so? Let me count the ways!

Bundled and non-bundled SQLite

Let's start with this page: Getting Started with EF Core.  In it, we discover that there's two ways to use SQLite: we can either use the SQLite that's bundled in with Windows. Or we can, for no good reason, bundle in a copy of SQLite into our app.

Except: using the bundled SQLite, is actually a terrible, terrible idea. Why? Because thanks to other libraries, chances are you'll already have a Sqlite in your app. And using the NuGet package to use the bundled Sqlite in fact just causes validation errors down the line when you switch to Release mode to make an actual release.

Tracking and Non-Tracking


Moving on to tracking: when you build an EF query, you can be either tracking or non-tracking. The value to non-tracking is described here: a non-tracking query is faster. Except that it really, really isn't, especially in my scenario.

Here's the scoop: I'm building a book database. There's a list of books; each book has some "flat" per-book information (like a string for the title) and also has both embedded information (like an entire class with the book's current download status) and lists of embedded information (like the set of people associated with the book aka the author, publisher, illustrator).

When you make a query for books, you only get the "flat" data. If you want the embedded classes like the download data, you have to include that in the query with a ".Include(DownloadData)" in the query.

When I search, I want the search query to be as slim as possible. Every embedded thing that's added just makes the search slower (and it's already too slow!). But EF has a hidden saving grace: you can do a query with a "slim" search, and then for individual specific items, do a thicker query. EF will automatically return the same actual object for the thin and thick queries, just with most stuff filled in.

That's a total win! I can slim down my queries as much as possible to find books for the user, and then plump them back up when I'm displaying specific individual books.

Yeah, but remember that advice about tracking and no-tracking? Turns out no-tracking queries don't do any of the above. And they aren't actually any faster (at least as far as I can tell).

Release Mode is Really slow

This took a long time to figure out: for a while in my e-book app, pressing the "next page" button was slow. And it was 100% my fault; I did a little bit of investigation (following several red herrings along the way!), and made it totally zippy.

Then I compiled in Release mode. Release mode is normally much faster than Debug mode, but (surprise!), that's now how EF rolls. Release mode is very much significantly slower than Debug mode.

There's tons of information about this on-line, most of it too old to be useful.

And then I discovered that my next page button was taking 15 seconds and spiking the CPU to 25% usage. That's completely unacceptable for an e-book reader; next page should be 0.1 seconds at most. My assumption, of course, is that my fixes (see earlier) were the cause, but they weren't.

All of the slowdown was simply saving the current book position. It turns out that one of the things that really slow in Release mode is saving database changes to the database. Like, 15 seconds slow.

Change Tracking (INotifyPropertyChanged) is fast

The SaveChanges advice is a real winner. It turns out that SaveChanges is super slow in release mode even though it's blink-of-an-eye in debug mode.

Using the default SaveChanges strategy, AFAICT EF has to keep a shadow copy of all object, and then when you do a SaveChanges it has to traverse everything to see what's changed. Then it can write out just the changes.

I don't know why this is fast in Debug mode and slow in Release mode.

With the change, EF can track all of your object changes; database updates are then super fast.

The only downside is that the advice doesn't actually say how to set the ChangeTrackingStrategy. You do that by created an override of the OnModelCreating method in you DbContext class.

   protected override void OnModelCreating(ModelBuilder mb)
   {

     mb.HasChangeTrackingStrategy
         (ChangeTrackingStrategy.ChangingAndChangedNotifications);
   }


Understanding that there's queries and there's queries

A surprise (to me) in EF is despite this not being documented, it's really important to know where a query is happening. EF is an abstraction layer on a database; you have to know what is happening on the database (an .AsQueryable()) versus what's happening in the EF layer (an .AsEnumerable()).

Need to make sure that your search is returning a "plump" object? That has to happen on the Database side, not the EF side. But need to do a complex search (like people do!)? That's strictly on the EF side of the house. Searches are faster on the Database side, but on the other hand, they are also super flaky.

But wait, it's just a little harder. EF converts queries at run-time. It's easy to make a LINQ query that looks right, and matches a simple, intuitive understanding of how everything is working, but will fail at run time.

What's all this about a flaky search?

Sometimes people want to find stuff by an author ("Dickens"). To do that, you make a search string ("%Dickens%") and then use EF.Functions.Like(b.Title, likeSearch) for the compare. Unless, of course, your search string is blank; if you do that, then the likeSearch string is plain "%%" and for some reason, EF (or maybe SQLite) really, really hates search strings like that.

As matching for "%%" is hard to understand. Yes, in general you shouldn't do weird searches. But this one is pretty simple and unambiguous.

How big is my database?

The main book database starts off with 60K entries (because I'm including the entire Project Gutenberg catalog! It's awesome!). Each entry has 6 sub-entries (list of People, list of Files, Review, DownloadData, NavigationData and a list of Notes).

A typical book has about 4 People and 12 Files associated with it. The database when compressed is 32 megabytes and expands to 158 megabytes when installed.

Getting started: making the database

To make your first database, EF needs to scan your app, looking for the data context so it can build a bunch of scaffolding. This doesn't work with a UWP app. Your solution needs a second project which exists only to be a regular .NET app that EF can figure out.

Yes, this sucks.

Things to avoid: splitting the database


My e-book program has a big, read-only database of all Project Gutenberg books. First thought: use two database files, one for stuff that really is read-only, and then use a different database file for everything that the user enters.

So a book title, and the files that can be download is database #1, and the notes the user wrote is database #2.

Don't do this. EF has no support for it, and it will only make your life more complex without being actually useful. Stick with a single database.

TL/DR: things to watch for with EF, SQLite and UWP

  1. Run in Release mode early; it will catch surprise performance issues
  2. Track which things are Queryable (database) and which are Enumerable (EF)
  3.  Use the embedded database, not the built-in database
  4. Don't bother with .NoTracking
  5. Absolutely use INotifyPropertyChanged and ..Changing to go fast
  6. Complex searches and queries should be run on the EF side