< March 2021 >
MonTueWedThuFriSatSun
01020304050607
08091011121314
15161718192021
22232425262728
29303101020304

Wednesday, 03 March 2021

11:00 PM

sourmash 4.0 is now available! Low low cost if you buy now! [Living in an Ivory Basement] 11:00 PM, Wednesday, 03 March 2021 01:20 PM, Thursday, 04 March 2021

So, we just released sourmash 4.0, our Python- and Rust-based open source tool for k-mer sketch-based analysis of metagenomes and genomes.

The high notes of this release are -

  • much better user experience design around creating and storing sketches;
  • removal of several obsolete features that were holding us back;
  • improved default Python API;

but they really don't particularly matter, to be honest :).

What's most cool about this release is...

Semantic versioning, feature compatibility, and deprecations

...it's a release where we purposely broke compatibility with previous versions, and went through a whole deprecation effort, and documented it all.

We use semantic versioning for sourmash. What this means is that major versions of sourmash (v3, v4, etc.) can break backwards compatibility, but minor versions (v3.1, v3.2) cannot. In practical terms, it means that when you use sourmash in a workflow or application, you can pin your software install to the major version without worrying about breakage - e.g. specify sourmash >=3,<4.

In the case of v3.x and v4.0, we systematically upgraded and improved sourmash performance and features during 3.x, and reserved breaking features for 4.0. Further, we added warnings and deprecations to v3.5 about features that were going to break in v4.0. Then we wrote a migration guide.

It was a lot of work! I probably put 40-80 hours into just this aspect of things over the last three months.

So... why did we do it?

Why did we do all this work?

We're not really sure how many people use sourmash outside the lab, but in the lab, we use it quite a bit. It's a pretty effective Swiss army knife tool for hacking and slashing at sequencing data, and a lot of basic questions about taxonomy and k-mer content can be answered with a little creative sourmashing.

And we have 5-6 workflows and pipelines that rely on sourmash.

So the first answer is that we did it for ourselves, so that we could robustly rely on sourmash in our workflows.

But the more complete answer is that we wanted to go through the semantic versioning & deprecation workflow and user communication/documentation stuff, so that we could just bake it into project expectations (for this project and for others). And we wanted to do this because we think this is the right way to do scientific software, and we wanted to communicate our expectations about changing sourmash behavior clearly and unambiguously.

What do we get out of it?

We're signaling to our current and prospective user base that we are open to their concerns. This results in improved user communication: for example, after our first release candidate we got some feedback that caused us to explicitly note that (1) numerical results shouldn't change and (2) old sourmash databases are still compatible with the new version.

We're providing a path to ourselves, as well as future developers (and future us), on how to think about pacing our changes to the software. On the one hand it's frustrating to delay cool and important changes to the software because we're not yet ready to release a big version; on the other hand, we took the time to more completely bake some of the new features and did several rounds of documentation improvement.

And, frankly, I think we ended up with better code reviews and development processes internally, because we had to think explicitly about how each particular change would impact users. (FWIW, our best guess is that we have about 1,000 users.)

What are the downsides?

Well, it was a lot of work :). And investment in the future of an academic software project is always a gamble!

Also, we don't have the person power to maintain multiple releases of sourmash, so it does mean we're more or less abandoning people who want to continue using sourmash v3.x. We didn't break any particularly big features, but it does require effort on our users' side to upgrade, so maybe some people will hold off because of that. And while I'll backport fixes to really important bugs if we have any in the next few months, we don't intend to backport performance improvements or new features. So maybe users will suffer a bit from that.

What's next?

We have a lot of new features that will probably come out in v4.1 and beyond, now that we can switch our efforts to that! Lots of exciting stuff is coming in the areas of protein k-mers and massive-scale database search!

I'm already looking forward to v5, where we can remove some of the features that we deprecated for v4.0.

And I think it's more than time for a new JOSS paper...

--titus

Feeds

FeedRSSLast fetchedNext fetched after
XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
Bits of DNA XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
blogs.perl.org XML 12:00 AM, Tuesday, 18 January 2022 12:15 AM, Tuesday, 18 January 2022
Blue Collar Bioinformatics XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
Boing Boing XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
Epistasis Blog XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
Futility Closet XML 12:00 AM, Tuesday, 18 January 2022 12:15 AM, Tuesday, 18 January 2022
gCaptain XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
Hackaday XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
In between lines of code XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
InciWeb Incidents for California XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
LeafSpring XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
Living in an Ivory Basement XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
LWN.net XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
Mastering Emacs XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
Planet Debian XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
Planet Emacsen XML 12:00 AM, Tuesday, 18 January 2022 12:15 AM, Tuesday, 18 January 2022
RNA-Seq Blog XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
RStudio Blog - Latest Comments XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
RWeekly.org - Blogs to Learn R from the Community XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
The Adventure Blog XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
The Allium XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
Variance Explained XML 12:00 AM, Tuesday, 18 January 2022 12:30 AM, Tuesday, 18 January 2022
January 2022
MonTueWedThuFriSatSun
27282930310102
03040506070809
10111213141516
17181920212223
24252627282930
31010203040506
December 2021
MonTueWedThuFriSatSun
29300102030405
06070809101112
13141516171819
20212223242526
27282930310102
November 2021
MonTueWedThuFriSatSun
01020304050607
08091011121314
15161718192021
22232425262728
29300102030405
October 2021
MonTueWedThuFriSatSun
27282930010203
04050607080910
11121314151617
18192021222324
25262728293031
September 2021
MonTueWedThuFriSatSun
30310102030405
06070809101112
13141516171819
20212223242526
27282930010203
August 2021
MonTueWedThuFriSatSun
26272829303101
02030405060708
09101112131415
16171819202122
23242526272829
30310102030405
July 2021
MonTueWedThuFriSatSun
28293001020304
05060708091011
12131415161718
19202122232425
26272829303101
June 2021
MonTueWedThuFriSatSun
31010203040506
07080910111213
14151617181920
21222324252627
28293001020304
May 2021
MonTueWedThuFriSatSun
26272829300102
03040506070809
10111213141516
17181920212223
24252627282930
31010203040506
April 2021
MonTueWedThuFriSatSun
29303101020304
05060708091011
12131415161718
19202122232425
26272829300102
March 2021
MonTueWedThuFriSatSun
01020304050607
08091011121314
15161718192021
22232425262728
29303101020304
February 2021
MonTueWedThuFriSatSun
01020304050607
08091011121314
15161718192021
22232425262728
November 2020
MonTueWedThuFriSatSun
26272829303101
02030405060708
09101112131415
16171819202122
23242526272829
30010203040506
September 2020
MonTueWedThuFriSatSun
31010203040506
07080910111213
14151617181920
21222324252627
28293001020304
July 2020
MonTueWedThuFriSatSun
29300102030405
06070809101112
13141516171819
20212223242526
27282930310102
June 2020
MonTueWedThuFriSatSun
01020304050607
08091011121314
15161718192021
22232425262728
29300102030405
May 2020
MonTueWedThuFriSatSun
27282930010203
04050607080910
11121314151617
18192021222324
25262728293031
April 2020
MonTueWedThuFriSatSun
30310102030405
06070809101112
13141516171819
20212223242526
27282930010203
February 2020
MonTueWedThuFriSatSun
27282930310102
03040506070809
10111213141516
17181920212223
24252627282901
January 2020
MonTueWedThuFriSatSun
30310102030405
06070809101112
13141516171819
20212223242526
27282930310102
December 2019
MonTueWedThuFriSatSun
25262728293001
02030405060708
09101112131415
16171819202122
23242526272829
30310102030405
November 2019
MonTueWedThuFriSatSun
28293031010203
04050607080910
11121314151617
18192021222324
25262728293001
October 2019
MonTueWedThuFriSatSun
30010203040506
07080910111213
14151617181920
21222324252627
28293031010203
August 2019
MonTueWedThuFriSatSun
29303101020304
05060708091011
12131415161718
19202122232425
26272829303101
July 2019
MonTueWedThuFriSatSun
01020304050607
08091011121314
15161718192021
22232425262728
29303101020304
June 2019
MonTueWedThuFriSatSun
27282930310102
03040506070809
10111213141516
17181920212223
24252627282930
May 2019
MonTueWedThuFriSatSun
29300102030405
06070809101112
13141516171819
20212223242526
27282930310102
April 2019
MonTueWedThuFriSatSun
01020304050607
08091011121314
15161718192021
22232425262728
29300102030405
March 2019
MonTueWedThuFriSatSun
25262728010203
04050607080910
11121314151617
18192021222324
25262728293031
February 2019
MonTueWedThuFriSatSun
28293031010203
04050607080910
11121314151617
18192021222324
25262728010203
January 2019
MonTueWedThuFriSatSun
31010203040506
07080910111213
14151617181920
21222324252627
28293031010203
December 2018
MonTueWedThuFriSatSun
26272829300102
03040506070809
10111213141516
17181920212223
24252627282930
31010203040506
November 2018
MonTueWedThuFriSatSun
29303101020304
05060708091011
12131415161718
19202122232425
26272829300102
October 2018
MonTueWedThuFriSatSun
01020304050607
08091011121314
15161718192021
22232425262728
29303101020304
September 2018
MonTueWedThuFriSatSun
27282930310102
03040506070809
10111213141516
17181920212223
24252627282930
August 2018
MonTueWedThuFriSatSun
30310102030405
06070809101112
13141516171819
20212223242526
27282930310102
July 2018
MonTueWedThuFriSatSun
25262728293001
02030405060708
09101112131415
16171819202122
23242526272829
30310102030405
June 2018
MonTueWedThuFriSatSun
28293031010203
04050607080910
11121314151617
18192021222324
25262728293001
May 2018
MonTueWedThuFriSatSun
30010203040506
07080910111213
14151617181920
21222324252627
28293031010203
April 2018
MonTueWedThuFriSatSun
26272829303101
02030405060708
09101112131415
16171819202122
23242526272829
30010203040506
February 2018
MonTueWedThuFriSatSun
29303101020304
05060708091011
12131415161718
19202122232425
26272801020304
January 2018
MonTueWedThuFriSatSun
01020304050607
08091011121314
15161718192021
22232425262728
29303101020304
December 2017
MonTueWedThuFriSatSun
27282930010203
04050607080910
11121314151617
18192021222324
25262728293031
November 2017
MonTueWedThuFriSatSun
30310102030405
06070809101112
13141516171819
20212223242526
27282930010203
September 2017
MonTueWedThuFriSatSun
28293031010203
04050607080910
11121314151617
18192021222324
25262728293001
August 2017
MonTueWedThuFriSatSun
31010203040506
07080910111213
14151617181920
21222324252627
28293031010203
March 2017
MonTueWedThuFriSatSun
27280102030405
06070809101112
13141516171819
20212223242526
27282930310102
January 2017
MonTueWedThuFriSatSun
26272829303101
02030405060708
09101112131415
16171819202122
23242526272829
30310102030405
November 2016
MonTueWedThuFriSatSun
31010203040506
07080910111213
14151617181920
21222324252627
28293001020304
October 2016
MonTueWedThuFriSatSun
26272829300102
03040506070809
10111213141516
17181920212223
24252627282930
31010203040506
September 2016
MonTueWedThuFriSatSun
29303101020304
05060708091011
12131415161718
19202122232425
26272829300102
August 2016
MonTueWedThuFriSatSun
01020304050607
08091011121314
15161718192021
22232425262728
29303101020304
July 2016
MonTueWedThuFriSatSun
27282930010203
04050607080910
11121314151617
18192021222324
25262728293031
May 2016
MonTueWedThuFriSatSun
25262728293001
02030405060708
09101112131415
16171819202122
23242526272829
30310102030405
April 2016
MonTueWedThuFriSatSun
28293031010203
04050607080910
11121314151617
18192021222324
25262728293001
December 2014
MonTueWedThuFriSatSun
01020304050607
08091011121314
15161718192021
22232425262728
29303101020304
October 2014
MonTueWedThuFriSatSun
29300102030405
06070809101112
13141516171819
20212223242526
27282930310102