text search doesn't prioritize package name or description #1876

heckj · 2022-06-29T17:19:41Z

heckj
Jun 29, 2022
Collaborator Sponsor

Please describe the bug

The search behavior seems to downplay the text in the title or description for relevance as compared to other elements. For example, a search for "ping" - which you'd expect to highlight SwiftyPing (https://swiftpackageindex.com/samiyr/SwiftyPing) shows a plethora of other elements ranked above that package.

I suspect what's happening, based on what's visible, is that ping is being used to look up matching keywords, and the prioritized list is being provided by what matches those keywords, where I suspect it would be better to prioritize the package's name and/or description above keyword matching results.

(In this case, I knew SwiftyPing was in the results - but I was surprised that doing a search for Ping would result in it being at item 14, since as far as I can tell it's the only package that includes any "ping" functionality in the index.)

Explain the steps needed to reproduce the bug

Navigate to to https://swiftpackageindex.com/search?query=ping

What was the expected behaviour?

I expected that SwiftyPing would be the top listed package

Screenshots

The screenshot below shows the top hits (as of Jun 29, 2022) for ping:

What web browser were you using? (if relevant)

If you're reporting a bug with the front end, please include your browser information below:

Platform: Desktop
Browser: Safari
Version: Version 15.5 (17613.2.7.1.8)

Additional context

Is there anything else we might find useful?

daveverwer · 2022-06-29T18:05:16Z

daveverwer
Jun 29, 2022
Maintainer

Hi Joe!

Search is hard! 😂

I agree it's not providing great results here, but I believe it is working as designed at the moment. Now, there's a good argument that that design is wrong, but I'll give a bit of background.

First things first, if you have an exact match on package name, that package gets rated top search result regardless of anything else.

However, after that, all packages that match are listed in reverse score order. So it’s everything that matches (including matching on a partial keyword, as CollectionViewPagingLayout does) ordered by our package score.

That’s not ideal by any classification, but it’s what we have right now. Search is on our list of things to look at, and maybe it’s getting towards that time. It’s potentially a huge task though. 😅

You can find the query builder in packageMatchQueryBuilder if you’re interested in taking a look.

I wish I had a better answer for you.

0 replies

heckj · 2022-06-29T19:16:50Z

heckj
Jun 29, 2022
Collaborator Author Sponsor

All good - thanks for pointing me into the code, I'll poke around a bit more. And yeah, I'm all too aware that search is seemingly simple and brutally complex, especially for subjective results such as what I was reporting. I assume any potential other solution should rely as much as possible on what's available and provided by the database? (that is, available to return via a SQL query of some form?)

Postgres does have some full-text search capabilities - although they need to be relevantly configured, so I might poke there to see how hard it would be to set up.

0 replies

daveverwer · 2022-06-29T19:33:34Z

daveverwer
Jun 29, 2022
Maintainer

I wasn't suggesting you tackle it, just that you might be interested 👍

I'll look forward to a fully redesigned and reimplemented search system by the time I wake up! 😂

0 replies

heckj · 2022-06-29T19:35:05Z

heckj
Jun 29, 2022
Collaborator Author Sponsor

It won't happen that fast, but I'll poke around a bit - I'm curious what's available and it's a good excuse to lean into Vapor a bit.

0 replies

heckj · 2022-06-30T17:01:00Z

heckj
Jun 30, 2022
Collaborator Author Sponsor

Did a bit of digging, and I managed to search on a word that happens to be quirky when it comes to matching in English. "Ping" overlaps a bit excessively with keywords that it shouldn't match with (mapping, shopping, typing) because ping overlaps with those character sequences, even though it has a semantically different meaning.

So the good news is that most search terms likely won't exhibit this pathological behavior.

I took a bit of time to understand the score that's used to order the results. Side note: if you're curious about distribution, I made little map of it and GitHub stars. Ordering by score in this case won't make a bit of difference, because it's more related to the characters ping overlapping with mapping and other variants of keywords.

A potential solution is to leverage the Postgres built-in full-text search capabilities, specifically to_tsvector and to_tsquery:

to_tsquery normalizes the search terms
to_tsvector normalizes the content you're searching
Both of which convert words to lexemes (their normalized forms - so things like ing are stripped where you're looking at adverbs. So something like mapping becomes map, while ping remains ping.

You can then compare against them using the @@ operator - which returns a boolean result, so it's perfect for an additional element in a where clause in SQL.

I did some digging to verify that most of the search terms were in English (the stemming/normalizing is VERY specific to language), so any enhancements here would need to be aware of the language in use. In scanning through a data dump, I didn't see any keywords that were in a language other than English, but there are a number of instances where the summary (also searched) is in another language - a couple of what I think were mandarin, one in Japanese, etc. Those were easy to spot because of the different alphabet. I also scanned, but not exhaustively, for European languages (guessing german, French, or Italian might be there) - but didn't see any samples in my previewing scan (just visual).

Given how the search is coded today, I suspect it's possible to apply the Postgres stemming/word normalization functions to keywordMatchQueryBuilder, replacing the where clause with some updated logic that uses the search terms vs. keywords after they're normalized - the line:
.where(keyword, ilike, SQLBind(searchPattern))

That said, I'm not very familiar with SQLKit, so I'm sorting out how to convert my raw SQL experiments into swift code using SQLKit to even see if it would help/work - and how much of a performance impact my crazy idea would have.

Right now, the code's doing the rough equiv of WHERE keyword ILIKE searchPattern, and I'd like to convert that into something like (pseudo code):

WHERE to_tsquery(search_words) @@ to_tsvector(coalesce(keyword,''))

(Not sure how to handle converting keywords as an array into the ts_vector type, but I think that can be done - idea being to slap them all together, then run the stemmer over the whole set)

As a broader solution, if the language of the summary is English, this same concept might be useful to be employed searching the contents of the summary fields that's maintained locally. That begs the question on support for other languages, but might still be an interesting option.

0 replies

daveverwer · 2022-07-01T07:44:04Z

daveverwer
Jul 1, 2022
Maintainer

I took a bit of time to understand the score that's used to order the results. Side note: if you're curious about distribution, I made little map of it and GitHub stars. Ordering by score in this case won't make a bit of difference, because it's more related to the characters ping overlapping with mapping and other variants of keywords.

I don't have access to that link, but I'd love to see it. I'd imagine it maps embarrassingly closely to GitHub stars right now. Something I'd love to fix.

0 replies

daveverwer · 2022-07-01T08:02:02Z

daveverwer
Jul 1, 2022
Maintainer

I'd agree we probably don't have many/any keywords in the database that are in the Latin alphabet that are not English apart from terms like bezier and acronyms or similar.

I'd be interested to see this move forward into something we can test (if that's not too much more work). I've no problem with things that make English-language search better as long as they don't break searches for other languages completely.

From a quick look through some data. Here might be some good searches to test:

0 replies

heckj · 2022-07-01T15:54:34Z

heckj
Jul 1, 2022
Collaborator Author Sponsor

@daveverwer Actually - the distribution is nothing like the GitHub stars setup. It's a pretty good uniform curve from what I'm seeing. I suck at using Observable, so I've fixed up and published the link - apparently the earlier one is "just for me?" - anyway: https://observablehq.com/d/acc9f659f95e8e58/ should be visible for you now. I made it unlisted.

The test searches are perfect, thank you!!

0 replies

daveverwer · 2022-07-02T08:35:37Z

daveverwer
Jul 2, 2022
Maintainer

That's fascinating, thanks Joe! What an amazing tool Observable is, too.

0 replies

heckj · 2022-07-06T17:49:17Z

heckj
Jul 6, 2022
Collaborator Author Sponsor

I've been stepping back from fiddling with the server code a bit to try and wrap some numbers around "good search results/bad search results" and more specifically how to make some reasonable estimate of if a search result implementation is better (or worse).

Fortunately, there's a good bit of prior research out on this topic, and there's a couple of algorithms that I'm think would be good to use, and wanted to present them here for discussion, if anyone else is interested in the topic:

Measures

The two key measures - which are implicitly balanced against each other in any search - are recall and precision. Both are effectively percentages, and come as a set. The formulas/algorithms for them:

Recall - the # of relevant documents / total number of relevant documents
Precision - the # of relevant documents / the number of documents retrieved

You can overbalance in either direction, and "best" is a balancing act. All of this is based on subjective measures - "Is this specific result "relevant" to my search - which puts it very much in the eye of the beholder." There's a couple of other quirks with these measurements - specifically recall - in that you rather have to "know" the whole set of documents that could be relevant to get to a resulting percentage.

Also know that you can game the recall numbers by simply returning everything, the precision of that shit drops way down though - so it's more a measure that's used when comparing two search implementations where you see relevant results in one search, but not another.

There's a secondary algorithm that's more useful for fine-tuning that's evolved from the web-search studies over the past 20+ years, called "mean reciprocal rank" - which adds weighting to "is it relevant or not" for the position in the search results, giving more credit for relevant results higher in the return list than not. This banks on the whole idea of "above the fold" and browser-based searches where us humans are far less likely to iterate through paginated search results to see what else might have been uncovered.

The weighting mechanisms are often as not matched with a non-binary choice of relevant - instead of "yes/no", in some search situations it's quite reasonable to see "yes, definitely", "somewhat", and "no" - or even finer gradations. This kind of the result is more focused on broader narrative text and phrase searching, by the graduated levels of relevance might still be very useful in the Package Index context.

Current Index Search

The current search primarily uses SQL ilike, which is a perfect starting point for search, and y'all have built up a nice combination that includes looking not only at package names, but aligned keywords, and even authors. Since we don't specifically know the intent of searches up front - the person searching could quite reasonably be looking for all packages in the index by an author (to see what else they might have available, or to see if they're in the index at all) as well as searching for a packages by name, or searching for packages by what they build on.

Measuring the current for comparisons

Ultimately, measuring relevance means someone (or better - multiple someones) going through some searches and rating the results. The most ideal situation is the person wanting to do the search providing a ranking, but in practice that's usually not very feasible. (There's some algorithmic feedback loops involving click-through recording that try to put something in this place, but that's a level of effort I suspect unwarranted for the index) That said, I think a few people all independently making an evaluations of search results, using our personal experience to do best-effort at inferring the search intent, would result in reasonable values.

To that end, I'm cobbling up a utility app that can make and store searches, and getting set to add the capability to store relevant rankings on the individual results, with the idea that I'd like to do some "comparison" searching between different techniques on the server (so comparing the hosted version to something I run locally, with a reasonably close database dump) - or to allow anyone else to do the same. You could even search over time, but since the index is fairly constantly growing and changing with submissions, you'd have to be more careful with assertions about the quality of search.

(The app is open & public on GitHub if you're curious or want to poke: https://github.com/heckj/SPISearch).

My idea was to capture the searches, add the ranking, and have the app help provide the measures to compare searching against the current implementation vs. something you might be running locally to test/work out - and being able to get measured values of the comparison in the terms of the measures I outlined above.

The ask

While I'm getting the utility app functional, I wanted to get any community feedback on the measures I was planning on using - to see if there's other measures or algorithms that anyone would like to suggest - and to see what people thought of graduated relevance (relevance measured as a value 0...something as opposed to a binary "yes/no") for making measurements.

4 replies

finestructure Jul 7, 2022
Maintainer

Very interesting, thanks for sharing, Joe!

You can overbalance in either direction, and "best" is a balancing act.

Is that truly the case? Reading through https://en.wikipedia.org/wiki/Precision_and_recall it seems to me these aren't pitted against each other but rather track the type I and type II error rates (false positives and false negatives). An ideal search would return only relevant results (no false positives) nor miss any relevant results (false negatives) and therefore the 100% on both specificity and selectivity.

Having said that, a lot of this will probably come down to ranking, because results further down the list effectively become irrelevant by position.

heckj Jul 7, 2022
Collaborator Author Sponsor

You're correct - they track the different positive and negative error rates. What I meant about balancing is that implementations can end up trading one for the other, and in some cases (quite a few related to stemming and word normalization), the evaluation ends up being a bit of balancing act - we lost a few on recall, but gained a notable amount on precision, being a relatively common outcome.

But yes, the placement of relevant results in the ranking (I think) has the biggest perceptual impact of the results as it plays into human nature, and our desire to "want to see what we're after right at the top". That's why I also proposed using mean reciprocal rank to get a number that rates a given set of results that can be used effectively in comparisons with alternate implementations. There have been a variety of weightings tried for the ranked-relevance judgement number, and I'm game to use others - figured I'd start with a suggest of at least a common one.

heckj Jul 11, 2022
Collaborator Author Sponsor

I have a simple utility app available via testlink: SPISearch (macOS & iOS), and a number of searches captured (from earlier in this bug-turned-discussion thread) and captured with the initial metrics. These searchrank docs are available on Github, or you can play with the app yourself (if anyone is so inclined) and see how the metrics I'm suggesting work out.

I'm going to continue to update this utility app for a couple more iterations before putting it to use to try out some alternate search responses, leveraging Postgres' ts_vector for (hopefully) better results. In particular, I'm hoping to use it to make it easier to capture a test result from a locally run development environment mirroring the online service, as well as being able to print and dump the metrics from multiple searches a bit more easily to make it easier to see and compare results as I try things out.

heckj Jul 14, 2022
Collaborator Author Sponsor

I've got the utility app functional sufficient to comparing server to local development results. Found a few more flaws in my logic last night, but fixed them this morning and will be pushing a few more testflight builds later today, if anyone else is using it.

heckj · 2022-07-14T18:15:29Z

heckj
Jul 14, 2022
Collaborator Author Sponsor

I have an initial trail in a branch that i looked at with metrics, starting with just the keyword matching mechanisms. There's a quirk here that's not unexpected, but not ideal that i wanted to point out.

When using the tsvector to normalize and then compare words, there's a common set that seems to get dropped: compound words that include terms prefixed by either NS or UI. I spotted this while reviewing the keyword search changes, as the keyword 'uibezierpath' was dropped for the keyword search 'bezier' when using the tsvector comparison (vs ILIKE which is what's happening currently).

I'm noodling on some ideas to allow for this, at least for the most common prefixes that we'll encounter ('NS', 'UI'), to hopefully preserve the recall. The steamer mechanism isn't exposed in Postgres, intentionally so - but i have some whack ideas for keeping the stock postgres infrastructure and pre-processing in some limited fashions to help alleviate this. It might not be worth it, but i'm kind of curious what I can get away with within this context.

This initial test didn't really expose what or how non-english words or alphabets might be impacted. That's next on my investigation list.

4 replies

heckj Jul 15, 2022
Collaborator Author Sponsor

A little update from today's exploration:

I worked primarily from SQL making various requests to mimic some of the core pieces of the searches in the latest test database, using stock to_tsvector, to_tsquery, and exploring a bit of tsrank(). Take-aways:

the UI/NS normalization issue definitely stands out, especially for the sample query "bezier" - especially when you start to limit the query to only returning where ts_vector matches.
the foreign character languages really suffer, and two sample queries (in mandarin) both fail to find any results when ts_vector is applied. (This isn't really surprising, as ts_vector is VERY specific to the language it's processing, English in this case)
non-english was minimally effected, but I didn't have a lot of exploration here - and I'm certain that the words don't fully normalize out (so searching for non-english terms in the summary of a package are likely to fail). The sample case I had at hand was a German package name, which didn't have any issues since it was a near-on perfect match.
on the plus side, the sample search which sent me off down this rabbit hole ('ping') worked marvelously well, not returning any irrelevant results.

ts_rank providing a ranking value - and with the initial results I was seeing varied between 0.06 and 0.2. I have a hypothesis that ordering by ranking may improve the search results via ordering. The results are currently ordered by an internally computed score - which has a notably larger distribution of 0 to ~130 by my reading of the source currently. I haven't sorted out what the actual possible ranges are for ts_rank(), which I suspect is going to require reading and understanding the related Postgres source code. This digging on ts_rank() and score was based on a pending question in my head "Can we legitimately fuse the results together, leveraging both values in a useful fashion?" - the proof of which would be "does it result in better relevancy scores for searches".

However, I'm not certain we'll get anything notable in value from ts_rank, so I'm going to back-burner that idea, push forward with keeping the ranking based on internally computed score.

The two issues I'm seeing with just applying ts_vector directly (and perhaps naively) is that we'll drop some relevant search results - the two clear cases based on my testing so far:

search terms that you'd expect to match with a UI... or NS... preview don't with ts_vector (NSView from the search term 'view' for example)
search terms that rely on non-english characters fail to return any results (I've two examples of search terms in mandarin: 作为弹幕 and 具方法, both of which return values with the currently existing search mechanisms.

The lack of results for foreign language searches, to me, is a killer on this idea. Because of that, I'm brainstorming on how to handle that scenario while still getting some benefit from using ts_vector within Postgres. I suspect the ILIKE %...% queries are going to be optimal for searches in mandarin (or any foreign character search terms - Japanese, Russian, Arabic, etc). At least barring some additional multi-lingual search technology - the likes of which certainly isn't built in to stock Postgres.

I do have some pre-processing ideas to accommodate normalizing the UI and NS prefix use cases - but nothing is solid (or terribly obvious) yet, and I want to be very cognizant of the potential impact to the latency for search times. So any pre-processing work I'd likely try to aim to establishing in SQL functions so that we could use that to continue to build the materialized search view to keep searches fast.

finestructure Jul 18, 2022
Maintainer

Thanks for the update, Joe! One thought I had, and I think it's what you're suggesting as well, is to use a "word boundary match" (which is what I think ts_vector essentially is) for ranking and not for finding.

For instance, right now we tack on an exact package name boolean to boost exact package matches to the top. I would imagine your original problem to fish the ping word matches out of the see of "flipping matches" that were boosted by a higher but mostly irrelevant score could be solved if we sorted by rank, score and drove rank off of ts_vector.

It could be a very coarse rank so we don't boost too much and make score useless as a consequence. Maybe even just essentially a boolean for starters: "has any word boundary matches in the haystack string". That could replace our current "hack" of the exact package name match and extend it to word matches in other places.

Although exact package name matches might still need a little extra boost - but these are essentially parameter tweaks then: how much to boost depending on which field you're matching on. A word match in the package name should be more important than one in a sprawling summary.

heckj Jul 18, 2022
Collaborator Author Sponsor

I was actually heading in the opposite direction originally, but I think your suggestion is a good way to tackle it. Instead of focusing on boosting the values that are good, I was trying to remove (not find at all) the results that were irrelevant - which led to that variety of good results not being exposed, which kind of sucked - most especially for non-english languages or character sets. I'd even extended the concept in my head down through comparing the search terms to ascii-letters and then doing the find in different search paths. In all of those cases, I was doing quite a lot of algorithmic/heuristic work to keep the recall from plummeting (missing good results that should be there).

If we stick with the ILIKE %something% sequence, that definitely gets us fantastic recall. The downside is that its returning irrelevant results as well. The numbers from ts_rank (the postgres ranking function that leverages a ts_vector of results) with score is returned - but the examples are only showing how it applies when ts_vector is the WHERE determiner. I'm unsure if we can get rankings for entries that ILIKE returns. If we can't, then we'll have a set of queries that doesn't align. We can make another of the sets of queries that are currently merged with a union, but they'll have fewer entries than the ILIKE queries, in some cases by quite a lot. That seems to be the best place to start, which pushes the problem down to how to merge those values and combine the ranking value (when available) and the score, which is currently ranking things.

In this path, I think it's important to find a way to use the ranking value as a primary value, and adjust it - but only slightly - with the result of the score algorithm that's already in place. My reasoning is that the score represents a sense of "package quality", and not really anything about the relevance as it pertains to any specific search request. In the case where there's several packages that match, I can see definitely wanting the "higher quality" package to get surfaced earlier as a pre-judgement that it'll be the better result. But in the cases were the %ILIKE% results return completely irrelevant results to the query, the quality of those excellent packages shouldn't drown out the relevancy metric.

My current suspicion (and limited observation) is that the relevancy rank from ts_rank is a non-linear ranking curve with a fairly long tail, where score is a nice standard distribution around the packages that you could normalize into a unit value with a generally uniform distribution. Figuring out how much to weight that uniform distribution so that it doesn't wash out the upper end of that ranking curve is the heart of the next thing to figure out to enable this kind of solution.

Anyway, I was originally leaning into the tactic of using ts_vector for the finding in order to remove the irrelevant results from even being found, thinking that even adjusting the inputs for ts_vector, and only taking that path when the query terms were all in english, would be an easier to implement solution than tackling the ranking boosting. But as I've hit all the edge cases and considered what it might take to adjust things to work with postgres' native mechanisms, it may be quite a bit easier to focus on boosting the relevancy (where we can get it) and merging that in to adjust only the ordering of the results.

I don't have any sense if I've just found that one outlier "that will drive you nuts", or if this is a broader problem that others are seeing as well. In the sample queries that Dave suggested, none of the results were as brutally irrelevant as the ping result set. Although to be fair, I had to use Google translate to have any hope of judging the relevance for the two searches in Mandarin, and I'm not sure I have a solid answer - but all these mechanisms wouldn't help that scenario, as they're english-language (and character) specific.

daveverwer Jul 18, 2022
Maintainer

This is amazing work, Joe. I've not been able to follow along super closely with this, but just to chime in on one point you made here:

In this path, I think it's important to find a way to use the ranking value as a primary value, and adjust it - but only slightly - with the result of the score algorithm that's already in place. My reasoning is that the score represents a sense of "package quality", and not really anything about the relevance as it pertains to any specific search request. In the case where there's several packages that match, I can see definitely wanting the "higher quality" package to get surfaced earlier as a pre-judgement that it'll be the better result. But in the cases were the %ILIKE% results return completely irrelevant results to the query, the quality of those excellent packages shouldn't drown out the relevancy metric.

I agree with this. Package quality is really important, but the package page is the primary place for that (eventually by giving it some kind of summary public package score or rating, in addition to what we already have with all other metadata). Search results by relevance with score as a (minor) secondary sounds good to me.

The issue of searching for NSTableView or UIScrollView is important, too. We can't afford to lose that.

heckj · 2022-07-19T00:53:21Z

heckj
Jul 19, 2022
Collaborator Author Sponsor

Something functional (although NOT clean) is up in a draft PR so we can start picking apart what's happening and get it out my head and step forward from my local experimental branch: #1905

One thing I wanted to call out is that I'm not currently enabling any of the fancier "binary search" capabilities that we could be using. There's multiple variations in Postgres to generate a ts_query, and I'm using the absolute simplest plainto_tsquery. The effect of this is that multiple terms are translated into "all of these terms must be present", which matches most closely to the ILIKE where clauses that are currently in play.

There is capability to use AND, OR, NOT, and "followed by" operators when constructing a ts_query - and a couple variations in getting there, but for those options to be useful at all, we'd need to be using the ts_query @@ ts_vector operator (that returns a Boolean value that indicates if the query matches). That would predicate the pain points - loosing UI/NS values due to the work tokenizer no groking the technical prefixes, and loosing any non-english-character search capabilities (for example, mandarin characters)

The initial PR doesn't do any boosting of rank by score - but I did do experiments and multiple iterations (mostly directly in SQL) to verify that - contrary to the documentation - Postgres tends to return a ranking value maxing out at 1.0, and with a minimum value or either 0 or 1e20. The documentation showed ranking examples stepping up to '3' in some cases - which I've been unable to reproduce.

I did comb through (to my limited ability) the Postgres source on ts_rank, and the experimental results matches what I thought might be happening from reading the code - but I'm far from a super-competent C programming to know for sure, and the documentation showed an example that was very different, so I mostly backed this with active experimentation. The ranking curve is non-linear, falling off with some averaged value of 1.0 to 0.0 in an n/n^2 kind of curve (for example a sequence like 1, 0.5, 0.25, 0.125).

Based on that, my thinking is a score-based boost (for otherwise equivalent rankings) might be best achieved by normalizing score into a 0...0.1 range and directly adding that onto the rank. The exact factor is a guess - that might overwhelm some relevance values, but I think that's low enough that it would end up sorting any "irrelevant" results by quality-of-package.

I haven't measured all the various numbers with my utility SPISearch app yet, but will do so and provide the results - either here, or within the PR - if that would be more meaningful.

2 replies

finestructure Jul 19, 2022
Maintainer

Great stuff, thanks Joe! I've left some initial comments in the PR. My time is probably going to be a bit limited the next week or two, unfortunately, but I'll do my best to follow up.

heckj Jul 21, 2022
Collaborator Author Sponsor

I did a bit of profiling, and the EXPLAIN SQL gave me back different query performances on iteration. Prior to the changes, the first explain reported ~97ms for the stock search query, dropping to ~70ms with repeated queries.

SELECT * FROM 
((SELECT DISTINCT 'author' AS "match_type", 
  NULL AS "keyword", 
  NULL::UUID AS "package_id", 
  NULL AS "package_name", 
  NULL AS "repo_name", "repo_owner", 
  NULL::INT AS "score", 
  NULL AS "summary", NULL::INT AS "stars", 
  NULL AS "license", NULL::TIMESTAMP AS "last_commit_date", 
  NULL::TIMESTAMP AS "last_activity_at", NULL::TEXT[] AS "keywords", 
  NULL::BOOL AS "has_docs", LEVENSHTEIN("repo_owner", 'bezier') AS "levenshtein_dist" 
  FROM "search" WHERE "repo_owner" ILIKE '%bezier%' ORDER BY "levenshtein_dist" LIMIT 50) 
 UNION ALL (SELECT DISTINCT 'keyword' AS "match_type", "keyword", NULL::UUID AS "package_id", 
			NULL AS "package_name", NULL AS "repo_name", NULL AS "repo_owner", NULL::INT AS "score", 
			NULL AS "summary", NULL::INT AS "stars", NULL AS "license", 
			NULL::TIMESTAMP AS "last_commit_date", NULL::TIMESTAMP AS "last_activity_at", 
			NULL::TEXT[] AS "keywords", NULL::BOOL AS "has_docs", 
			LEVENSHTEIN("keyword", 'bezier') AS "levenshtein_dist" 
			FROM "search", UNNEST("keywords") AS "keyword" 
			WHERE "keyword" ILIKE '%bezier%' ORDER BY "levenshtein_dist" LIMIT 50) 
 UNION ALL (SELECT 'package' AS "match_type", NULL AS "keyword", "package_id", "package_name", "repo_name", "repo_owner", "score", "summary", "stars", "license", "last_commit_date", "last_activity_at", "keywords", "has_docs", NULL::INT AS "levenshtein_dist" 
			FROM "search" 
			WHERE CONCAT_WS(' ', "package_name", COALESCE("summary", ''), "repo_name", "repo_owner", 
							ARRAY_TO_STRING("keywords", ' ')) ~* 'bezier' AND "repo_owner" IS NOT NULL AND "repo_name" IS NOT NULL 
			ORDER BY LOWER("package_name") = 'bezier' DESC, "score" DESC, "package_name" ASC LIMIT 21 OFFSET 0)) AS "t"

By comparison the completely inline ts_vector variant I created returned an initial search time of 145ms, dropping first to 111ms and then to 70ms on repeated iterations of the query.

     SELECT * FROM (
        (
            SELECT DISTINCT 'author' AS "match_type", NULL AS "keyword", NULL::UUID AS "package_id", NULL AS "package_name", NULL AS "repo_name", "repo_owner", NULL::INT AS "score", NULL AS "summary", NULL::INT AS "stars", NULL AS "license", NULL::TIMESTAMP AS "last_commit_date", NULL::TIMESTAMP AS "last_activity_at", NULL::TEXT[] AS "keywords", NULL::BOOL AS "has_docs", LEVENSHTEIN("repo_owner", 'bezier') AS "levenshtein_dist", ts_rank("tsvector", "tsquery") AS "tsrankvalue"
            FROM "search", plainto_tsquery('bezier') AS "tsquery", setweight(to_tsvector("repo_owner"), 'A') AS "tsvector"
            WHERE "repo_owner" ILIKE '%bezier%'
            ORDER BY "levenshtein_dist"
            LIMIT 50
        )
     
         UNION ALL (
            SELECT DISTINCT 'keyword' AS "match_type", "keyword", NULL::UUID AS "package_id", NULL AS "package_name", NULL AS "repo_name", NULL AS "repo_owner", NULL::INT AS "score", NULL AS "summary", NULL::INT AS "stars", NULL AS "license", NULL::TIMESTAMP AS "last_commit_date", NULL::TIMESTAMP AS "last_activity_at", NULL::TEXT[] AS "keywords", NULL::BOOL AS "has_docs", LEVENSHTEIN("keyword", 'bezier') AS "levenshtein_dist", ts_rank("tsvector", "tsquery") AS "tsrankvalue"
            FROM "search", UNNEST("keywords") AS "keyword", plainto_tsquery('bezier') AS "tsquery", setweight(to_tsvector("keyword"), 'B') AS "tsvector"
            WHERE "keyword" ILIKE '%bezier%'
            ORDER BY "tsrankvalue"
            LIMIT 50
         )
     
         UNION ALL (
            SELECT 'package' AS "match_type", NULL AS "keyword", "package_id", "package_name", "repo_name", "repo_owner", "score", "summary", "stars", "license", "last_commit_date", "last_activity_at", "keywords", "has_docs", NULL::INT AS "levenshtein_dist", ts_rank("tsvector", "tsquery") AS "tsrankvalue"
            FROM "search", plainto_tsquery('bezier') AS "tsquery", setweight(to_tsvector(coalesce(summary, '') || coalesce(package_name, '') || coalesce(ARRAY_TO_STRING(keywords, ' '), '')), 'A') AS "tsvector"
            WHERE CONCAT_WS(' ', "package_name", COALESCE("summary", ''), "repo_name", "repo_owner", ARRAY_TO_STRING("keywords", ' ')) ~* 'bezier' AND "repo_owner" IS NOT NULL AND "repo_name" IS NOT NULL
            ORDER BY LOWER("package_name") = 'bezier' DESC, "tsrankvalue" DESC, "score" DESC, "package_name" ASC
            LIMIT 21 OFFSET 0
         )
     ) AS "t"

I'm not sure what the numbers are that would raise a yellow, or red, flag for the timing - I'd fully guess that we could optimize some of this by building in the ts_vector() functions into the weighted view creation process. Figured best to start with presenting the raw numbers first.

I also found an interesting quirk - the version that's in the PR doesn't use different weights for different sections of text. The documentation seemed to indicate that you could concatenate ts_vector results, but I'm running in SQL syntax errors when trying to apply the || operator in the from query. In case anyone has seen this sort of thing previously (I'm not a SQL expert), the following SQL works fine:

     SELECT setweight(to_tsvector(coalesce(summary, '')), 'A') || setweight(to_tsvector(coalesce(ARRAY_TO_STRING(keywords,' '))) , 'B')
     FROM search
     WHERE package_name = 'SCNBezier'

But when I tried to push the || operator into the FROM portion of the query, I get a syntax error:

     SELECT foo
     FROM search,
	 setweight(to_tsvector(coalesce(summary, '')), 'A') || setweight(to_tsvector(coalesce(ARRAY_TO_STRING(keywords,' '))) , 'B') as foo
     WHERE package_name = 'SCNBezier'

I can build the ts_vector instances separately in the FROM section, even weighted, but it doesn't seem to want to be able to concatenate them, and I'm at a bit of a loss as to why.

heckj · 2022-07-22T15:19:08Z

heckj
Jul 22, 2022
Collaborator Author Sponsor

I had a large pile of exploratory SQL queries and learning comments that seem like they'd be better here, so I'm pulling that out from the PR work and stashing it here in case anyone wants to reference it.

Exploratory SQL commands, specific to Postgres that leverage the existing
'search' materialized view, and explore the full-text search capabilities and
functions included within Postgres:

select to_tsvector('stopping watching ping') @@ to_tsquery('ping')

returns true

The @@ operator applies a query to a vector to determine if the query would match
the results in the vector.

The coalesce function ensures that a NULL result from a column just results in an
empty string, otherwise a tsvector(NULL) is NULL and cascades to not return any values.

select * from search where to_tsvector(coalesce(summary, '') || coalesce(package_name, '')) @@ to_tsquery('ping')

select package_name, to_tsvector(ARRAY_TO_STRING(keywords, ' ')) from search

select to_tsvector('stopping watching ping') @@ to_tsquery('ping')

select * from search where to_tsvector(coalesce(summary, '') || coalesce(package_name, '')) @@ to_tsquery('ping')

The || concatenating operator above is concatenating the strings, which then gets converted into
a tsvector data type. The || operator should also be able to concatenate ts_vector types.

SELECT package_name, keywords_vector, ts_rank(keywords_vector, query) AS rank
FROM search, to_tsquery('bezier') query, to_tsvector(ARRAY_TO_STRING(keywords, ' ')) keywords_vector
WHERE query @@ keywords_vector
ORDER BY rank DESC;

Exploring the vector spaces

SELECT package_name, keywords, combined_vector, query, ts_rank(combined_vector, query) AS rank
FROM search,
  plainto_tsquery('bezier') query,
  to_tsvector(ARRAY_TO_STRING(keywords, ' ')) keywords_vector,
  to_tsvector(package_name) name_vector,
  to_tsvector(summary) summary_vector,
  to_tsvector(coalesce(summary, '') || coalesce(package_name, '') || coalesce(ARRAY_TO_STRING(keywords, ' '), '')) combined_vector
WHERE query @@ combined_vector
ORDER BY rank DESC;

exploring weighting values and how they apply

SELECT package_name, keywords, combined_vector, query, ts_rank(combined_vector, query) AS rank
FROM search,
  to_tsquery('ux & (bezier | arrow) ') query,
  to_tsvector(ARRAY_TO_STRING(keywords, ' ')) keywords_vector,
  to_tsvector(package_name) name_vector,
  to_tsvector(summary) summary_vector,
  to_tsvector(coalesce(summary, '') || coalesce(package_name, '') || coalesce(ARRAY_TO_STRING(keywords, ' '), '')) combined_vector
WHERE query @@ combined_vector
ORDER BY rank DESC;

-- vs.

SELECT package_name, keywords, combined_vector, query, ts_rank(combined_vector, query) AS rank
FROM search,
  to_tsquery('ux & (bezier | arrow) ') query,
  to_tsvector(ARRAY_TO_STRING(keywords, ' ')) keywords_vector,
  to_tsvector(package_name) name_vector,
  to_tsvector(summary) summary_vector,
  setweight(to_tsvector(coalesce(summary, '') || coalesce(package_name, '') || coalesce(ARRAY_TO_STRING(keywords, ' '), '')), 'A') combined_vector
WHERE query @@ combined_vector
ORDER BY rank DESC;

The results of the above, varying the weighting to see the effect:

Weight class A -> .99
Weight class B -> .54
Weight class C -> .3
Weight class D -> .15 // AND THE DEFAULT IF NOT SPECIFIED!! //

Contrary to what the documentation example shows, I couldn't arrange a rank result that
exceeded 1.0, which matches with what I thought I understood from the underlying code. I was
able to get it up to a full 1.0 value with a lot of repeated & query parameters.

Using tsvector @@ tsquery as the boolean to indicate a WHERE statement is super-specific
to ASCII english content, and fails with some variations.

As an example, the query for bezier is failing to return values that have UIBezierCurve
in the collection of strings that get converted into tsvector values. Terms prefixed with
UI or NS (as in common Apple type names in Swift) won't match to the underlying lexemes, as
they aren't stripped in the stemming.

One option to work around this is specifying all possible values into a thesaurus, which augments
the stemming/normalization of words when to_tsvector() is called on the string. Another is to
pre-processing the text content to manually augment it so that you don't need to know every
variation of the work a-priori, and can apply a pattern (keep the original, and add another
term in parallel that is the value after stripping the 'UI' prefix).

SELECT package_name, ts_rank(combined_vector, query) AS rank, combined_vector
FROM search,
  to_tsquery('ux & (bezier | arrow) ') query,
  to_tsvector(ARRAY_TO_STRING(keywords, ' ')) keywords_vector,
  to_tsvector(package_name) name_vector,
  to_tsvector(summary) summary_vector,
  setweight(to_tsvector(coalesce(summary, '') || coalesce(package_name, '') || coalesce(ARRAY_TO_STRING(keywords, ' '), '')), 'A') combined_vector
WHERE query @@ combined_vector
ORDER BY rank DESC;

The ranking values returned from exploration: bezier

"package_name"    "rank"    "combined_vector"
"SCNBezier"    0.082745634    "'anim':2,14 'bezier':4,11,13,16 'bezier-anim':12 'bezier-curv':15 'cocoapod':18,19 'creat':1 'curv':5,17 'framework':23 'manag':28 'number':8 'packag':27 'pointsscnbezierarkit':10 'scenekit':20,22 'scenekit-framework':21 'swift':24,26 'swift-package-manag':25 'swiftpm':29"
"Arrows"    0.06079271    "'anim':4 'arrow':1,15 'bezier':17 'bezier-path':16 'core':20 'core-graph':19 'custom':5 'feedback':9 'give':8 'graphic':21 'indic':22 'menu':26 'panel':23 'panels.arrowsanimations':14 'path':18 'slide':13,25 'sliding-menu':24 'ui':12 'ux':27 'view':6"
"ArrowView"    0.06079271    "'arrow':10 'bezier':22 'bezier-path':21 'draw':5 'effect.arrowviewarrow':20 'end':13 'io':2,24 'line':7 'nice':18 'path':23 'pm':27 'simpl':1 'swift':26 'swift-pm':25 'uibezierpath':15,28 'uiview':29 'use':14 'view':3 'wavi':19"
"SimpleRoulette"    0.06079271    "'anim':19 'bezier':5 'carthag':6 'chart':7,8 'cocoapod':9 'compon':29 'creat':1 'ease.simplerouletteangle':4 'framework':32 'io':10,12 'ios-swift':11 'maco':14 'pod':15 'roulett':2,16,18,21 'roulette-anim':17 'roulette-wheel':20 'simpl':23 'spm':24 'swift':13,25 'swiftui':26,28,31 'swiftui-compon':27 'swiftui-framework':30 'wheel':22"

Same sample, except for wtv

"package_name"    "rank"    "combined_vector"
"WTV"    0.06079271    "'reflect':6 'swift':7 'variabl':4 'wtv':8 'wtvmirror':5"
"DataObject"    0.06079271    "'dataobject':1,4 'dataobjectani':3 'json':5 'object':6 'surl':7 'swift':8 'wtv':9"

Sample for ping

"package_name"    "rank"    "combined_vector"
"SwiftyPing"    0.06079271    "'5swiftypingicmp':7 '5swiftypingicmp-ping':6 'client':3 'icmp':1 'network':9 'ping':2,8 'swift':5 'swift4':10 'swift5':11"

Sample for guass-krueger

"package_name"    "rank"    "combined_vector"
"GaussKrueger"    0.26832977    "'convert':1 'coordin':3 'gauss':10 'gauss-krueg':9 'gk4':2 'krueger':11 'versagausskruegercoordin':8 'vice':7 'wgs84':5,12"

Sample for iso639-1

"package_name"    "rank"    "combined_vector"
"ISO639"    0.26832977    "'-1':7 '-2':9 'iso639':1,6,8 'iso639iso639':5 'languag':2,10,11 'librari':15 'manag':19 'packag':18 'swift':4,12,14,17 'swift-librari':13 'swift-package-manag':16 'swift5':20"

Sample for 具方法

zero results

I think we want to use ts_rank INSTEAD of ts_rank_cd because most of the search queries are single
terms, and the cover distance value (returned as a rank parameter) doesn't help in single-word queries.

ts_rank can take a normalization parameter - and integer that reflects how the ranking values
are normalized. The value of 32 (used in the ts_rank_cd example below) indicates: rank/(rank+1),
which forces the ranking between 0 and 1. The default value is 0, which provides no normalization.

The normalization factor is always related to the document length, which in our case isn't
very useful since all the documents are roughly the same length. In the case of ts_rank_cd,
it can also be related to the distance between search values - the distances between them
in tsvector.

On the plus side, we can get a rank for a non-matching term that reliably comes back as 0, so we can
work with that to provide additional tweaking. Example:

SELECT query, a_vector, ts_rank(a_vector, query) AS rank
FROM plainto_tsquery('fiddlesticks') query,
  to_tsvector('Heres my example package of swift package manager') a_vector
-- WHERE query @@ a_vector
ORDER BY rank DESC;

The union query implemented in the first cut of this:

SELECT * FROM (
(
    SELECT DISTINCT 'author' AS "match_type", NULL AS "keyword", NULL::UUID AS "package_id", NULL AS "package_name", NULL AS "repo_name", "repo_owner", NULL::INT AS "score", NULL AS "summary", NULL::INT AS "stars", NULL AS "license", NULL::TIMESTAMP AS "last_commit_date", NULL::TIMESTAMP AS "last_activity_at", NULL::TEXT[] AS "keywords", NULL::BOOL AS "has_docs", LEVENSHTEIN("repo_owner", 'bezier') AS "levenshtein_dist", ts_rank("tsvector", "tsquery") AS "tsrankvalue"
    FROM "search", plainto_tsquery('bezier') AS "tsquery", setweight(to_tsvector("repo_owner"), 'A') AS "tsvector"
    WHERE "repo_owner" ILIKE '%bezier%'
    ORDER BY "levenshtein_dist"
    LIMIT 50
)

 UNION ALL (
    SELECT DISTINCT 'keyword' AS "match_type", "keyword", NULL::UUID AS "package_id", NULL AS "package_name", NULL AS "repo_name", NULL AS "repo_owner", NULL::INT AS "score", NULL AS "summary", NULL::INT AS "stars", NULL AS "license", NULL::TIMESTAMP AS "last_commit_date", NULL::TIMESTAMP AS "last_activity_at", NULL::TEXT[] AS "keywords", NULL::BOOL AS "has_docs", LEVENSHTEIN("keyword", 'bezier') AS "levenshtein_dist", ts_rank("tsvector", "tsquery") AS "tsrankvalue"
    FROM "search", UNNEST("keywords") AS "keyword", plainto_tsquery('bezier') AS "tsquery", setweight(to_tsvector("keyword"), 'B') AS "tsvector"
    WHERE "keyword" ILIKE '%bezier%'
    ORDER BY "tsrankvalue"
    LIMIT 50
 )

 UNION ALL (
    SELECT 'package' AS "match_type", NULL AS "keyword", "package_id", "package_name", "repo_name", "repo_owner", "score", "summary", "stars", "license", "last_commit_date", "last_activity_at", "keywords", "has_docs", NULL::INT AS "levenshtein_dist", ts_rank("tsvector", "tsquery") AS "tsrankvalue"
    FROM "search", plainto_tsquery('bezier') AS "tsquery", setweight(to_tsvector(coalesce(summary, '') || coalesce(package_name, '') || coalesce(ARRAY_TO_STRING(keywords, ' '), '')), 'A') AS "tsvector"
    WHERE CONCAT_WS(' ', "package_name", COALESCE("summary", ''), "repo_name", "repo_owner", ARRAY_TO_STRING("keywords", ' ')) ~* 'bezier' AND "repo_owner" IS NOT NULL AND "repo_name" IS NOT NULL
    ORDER BY LOWER("package_name") = 'bezier' DESC, "tsrankvalue" DESC, "score" DESC, "package_name" ASC
    LIMIT 21 OFFSET 0
 )
) AS "t"

With the inputs:

["bezier", "bezier", "%bezier%", "bezier", "bezier", "%bezier%", "bezier", "bezier", "bezier"]


Analyze: `98ms` results - local postgres DB with recent dump, migrated to latest.

And the counts:
```SQL
SELECT "keyword", "count" FROM "weighted_keywords" WHERE "keyword" IN ($1, $2, $3, $4, $5)

With the inputs:

["uibezierpath", "bezier-path", "bezier-curve", "bezier-animation", "bezier"]

Experiments in seeing the tsvector weighted, and then coalesced:

SELECT setweight(to_tsvector(coalesce(summary, '')), 'A') || setweight(to_tsvector(coalesce(ARRAY_TO_STRING(keywords,' '))) , 'B')
FROM search
WHERE package_name = 'SCNBezier'

Just the summary vector, weighted 'A':

'anim':2A 'bezier':4A 'creat':1A 'curv':5A 'number':8A 'point':10A

With keywords coalesced into it, weighted 'B':

'anim':2A,15B 'arkit':11B 'bezier':4A,12B,14B,17B 'bezier-anim':13B 'bezier-curv':16B
'cocoapod':19B,20B 'creat':1A 'curv':5A,18B 'framework':24B 'manag':29B 'number':8A
'packag':28B 'point':10A 'scenekit':21B,23B 'scenekit-framework':22B 'swift':25B,27B
'swift-package-manag':26B 'swiftpm':30B

2 replies

finestructure Jul 23, 2022
Maintainer

That's great, thanks Joe! I'll try work through this asap.

heckj Jul 23, 2022
Collaborator Author Sponsor

No worries or rush. Tell me when you're available, and I'm happy to hop on a call and we can talk through it - choices, tradeoffs, etc - including what we need to do before we land this. I wasn't sure, in particular, about the timing and if we should build the vectors into the materialized search view, so I'd love to go over that with you at some point.

Sherlouk · 2022-07-30T12:03:53Z

Sherlouk
Jul 30, 2022
Collaborator Sponsor

Just adding another example to this discussion, I was trying to search for FTP libraries which seems to match perfectly with "SwiftPM" and many other versions of it. So very difficult to find the library I was looking for.

8 replies

heckj Jul 30, 2022
Collaborator Author Sponsor

My hypothesis on why DittoPackage was ranking top spot wasn't correct - I did a SQL query and dumped the ranking and sorted output, and it's up there - but I think due to a bit of erroneous data in the index. It's an unusual package, so it looks like it's a separate issue altogether.

On the plus side, that's another bonus point for my implementation - still shows you the loads of crap that just happen to "grep match", but ranks the real result MUCH more favorably.

Sherlouk Jul 31, 2022
Collaborator Sponsor

Awesome work! Glad to have provided another valuable example to help test these improvements you're pushing through 😃

Definitely an unfortunate acronym..

Sherlouk Jul 31, 2022
Collaborator Sponsor

I'm sure that @daveverwer will be happy to see that LeftPad is also popping up in these results 😂

Test package woop

daveverwer Jul 31, 2022
Maintainer

LeftPad should be the top result on every search. Did I not mention that was a requirement?! 😂

heckj Jul 31, 2022
Collaborator Author Sponsor

Be careful what you wish for... 🤣

daveverwer · 2022-08-11T17:49:12Z

daveverwer
Aug 11, 2022
Maintainer

Here's another good test search that I just did for real, knowing the package was there, and expecting it to be top result

https://swiftpackageindex.com/search?query=introsp

I was expecting to find Introspect top. I bet it will be with the changes.

0 replies

finestructure · 2022-08-11T19:53:37Z

finestructure
Aug 11, 2022
Maintainer

Mmm, for a partial search string this feels like a good result, with solid matches in pos 1 and 2. I’ll try the branch tomorrow but I’d be surprised if it fared better since “introsp” is not a word.

Note that searching for “introspect” would have given you pos 1.

1 reply

daveverwer Aug 11, 2022
Maintainer

Ah yea it wasn't a terrible result, but I bet it's better with the new changes. I mainly wanted to write it down somewhere so I remember to test it

heckj · 2022-08-11T21:14:24Z

heckj
Aug 11, 2022
Collaborator Author Sponsor

No notable improvement (but not worse either):

Because it's a partial word, the lexeme normalization won't give it any specific boosting here, so it's effectively falling back to the more grep-like finding mechanism.

(tested on current keyword-search-tweak branch with the DB dump from Aug 9th)

0 replies

finestructure · 2022-08-12T09:50:49Z

finestructure
Aug 12, 2022
Maintainer

This is now up on staging!

https://staging.swiftpackageindex.com/search?query=ping

0 replies

daveverwer · 2022-08-12T09:53:11Z

daveverwer
Aug 12, 2022
Maintainer

This is incredible work. Thanks so much for all your hard work, @heckj (and to @finestructure, for shepherding it in!)

🎉

1 reply

heckj Aug 12, 2022
Collaborator Author Sponsor

Wonderful! Thank you gents!

text search doesn't prioritize package name or description #1876

heckj Jun 29, 2022 Collaborator Sponsor

Replies: 19 comments · 22 replies

daveverwer Jun 29, 2022 Maintainer

heckj Jun 29, 2022 Collaborator Author Sponsor

daveverwer Jun 29, 2022 Maintainer

heckj Jun 29, 2022 Collaborator Author Sponsor

heckj Jun 30, 2022 Collaborator Author Sponsor

daveverwer Jul 1, 2022 Maintainer

daveverwer Jul 1, 2022 Maintainer

heckj Jul 1, 2022 Collaborator Author Sponsor

daveverwer Jul 2, 2022 Maintainer

heckj Jul 6, 2022 Collaborator Author Sponsor

Measures

Current Index Search

Measuring the current for comparisons

The ask

finestructure Jul 7, 2022 Maintainer

heckj Jul 7, 2022 Collaborator Author Sponsor

heckj Jul 11, 2022 Collaborator Author Sponsor

heckj Jul 14, 2022 Collaborator Author Sponsor

heckj Jul 14, 2022 Collaborator Author Sponsor

heckj Jul 15, 2022 Collaborator Author Sponsor

finestructure Jul 18, 2022 Maintainer

heckj Jul 18, 2022 Collaborator Author Sponsor

daveverwer Jul 18, 2022 Maintainer

heckj Jul 19, 2022 Collaborator Author Sponsor

finestructure Jul 19, 2022 Maintainer

heckj Jul 21, 2022 Collaborator Author Sponsor

heckj Jul 22, 2022 Collaborator Author Sponsor

finestructure Jul 23, 2022 Maintainer

heckj Jul 23, 2022 Collaborator Author Sponsor

Sherlouk Jul 30, 2022 Collaborator Sponsor

heckj Jul 30, 2022 Collaborator Author Sponsor

Sherlouk Jul 31, 2022 Collaborator Sponsor

Sherlouk Jul 31, 2022 Collaborator Sponsor

daveverwer Jul 31, 2022 Maintainer

heckj Jul 31, 2022 Collaborator Author Sponsor

daveverwer Aug 11, 2022 Maintainer

finestructure Aug 11, 2022 Maintainer

daveverwer Aug 11, 2022 Maintainer

heckj Aug 11, 2022 Collaborator Author Sponsor

finestructure Aug 12, 2022 Maintainer

daveverwer Aug 12, 2022 Maintainer

heckj Aug 12, 2022 Collaborator Author Sponsor

heckj
Jun 29, 2022
Collaborator Sponsor

Replies: 19 comments 22 replies

daveverwer
Jun 29, 2022
Maintainer

heckj
Jun 29, 2022
Collaborator Author Sponsor

daveverwer
Jun 29, 2022
Maintainer

heckj
Jun 29, 2022
Collaborator Author Sponsor

heckj
Jun 30, 2022
Collaborator Author Sponsor

daveverwer
Jul 1, 2022
Maintainer

daveverwer
Jul 1, 2022
Maintainer

heckj
Jul 1, 2022
Collaborator Author Sponsor

daveverwer
Jul 2, 2022
Maintainer

heckj
Jul 6, 2022
Collaborator Author Sponsor

finestructure Jul 7, 2022
Maintainer

heckj Jul 7, 2022
Collaborator Author Sponsor

heckj Jul 11, 2022
Collaborator Author Sponsor

heckj Jul 14, 2022
Collaborator Author Sponsor

heckj
Jul 14, 2022
Collaborator Author Sponsor

heckj Jul 15, 2022
Collaborator Author Sponsor

finestructure Jul 18, 2022
Maintainer

heckj Jul 18, 2022
Collaborator Author Sponsor

daveverwer Jul 18, 2022
Maintainer

heckj
Jul 19, 2022
Collaborator Author Sponsor

finestructure Jul 19, 2022
Maintainer

heckj Jul 21, 2022
Collaborator Author Sponsor

heckj
Jul 22, 2022
Collaborator Author Sponsor

finestructure Jul 23, 2022
Maintainer

heckj Jul 23, 2022
Collaborator Author Sponsor

Sherlouk
Jul 30, 2022
Collaborator Sponsor

heckj Jul 30, 2022
Collaborator Author Sponsor

Sherlouk Jul 31, 2022
Collaborator Sponsor

Sherlouk Jul 31, 2022
Collaborator Sponsor

daveverwer Jul 31, 2022
Maintainer

heckj Jul 31, 2022
Collaborator Author Sponsor

daveverwer
Aug 11, 2022
Maintainer

finestructure
Aug 11, 2022
Maintainer

daveverwer Aug 11, 2022
Maintainer

heckj
Aug 11, 2022
Collaborator Author Sponsor

finestructure
Aug 12, 2022
Maintainer

daveverwer
Aug 12, 2022
Maintainer

heckj Aug 12, 2022
Collaborator Author Sponsor