@darrel_miller @vincentbiret I think @kin and @plgah are using a different dataset than I was, from a fresh harvest. Either way, it’s fairly large, several thousand entries at this point. How would you want to consume that? Ideally it would be de-duped, but even deciding the criteria for what counts as a duplicate is an interesting exercise.