
Intelligent E-commerce Search: Beyond SQL LIKE
Search is the most-used feature in e-commerce stores with catalogs above a few hundred products. When it works well, it's invisible — the customer types, finds, buys. When it works poorly, the customer types "white sneakers", doesn't find the "White Sneakers" in the catalog, and goes to a competitor. Baymard Institute UX research consistently shows that poor search correlates with up to 30% conversion loss among users who engage with it.
The problem starts when e-commerce stores implement search with SQL LIKE '%sneakers%'. It works for exact matches, fails for everything else: capitalization, plurals/singulars, typos, synonyms, word order. The solution for relevant search in production requires specialized search engines.
The SQL LIKE Problem for Production Search
SQL LIKE is a substring search. It finds exactly what you typed, in the order you typed it, with no tolerance for variations. In practice, this means:
- "sneaker" doesn't find "Sneakers" (plural)
- "denim shorts" doesn't find "shorts denim" (different word order)
- "iPhne" doesn't find "iPhone" (typo)
- "smartphone" doesn't find "cell phone" (synonym)
- Performance degrades exponentially as the catalog grows
Additionally, LIKE '%term%' doesn't use indexes efficiently — the query performs a full-table scan on large tables, generating slowness that directly impacts the user experience.
PostgreSQL's ILIKE and MySQL's FULLTEXT INDEX solve part of the problem (case-insensitivity, individual word search), but still don't handle fuzzy search, synonyms, or relevance ranking. They're marginal improvements on a fundamentally inadequate approach for serious search.
Elasticsearch vs Algolia vs Typesense: Comparison
These three options dominate the e-commerce search market, with very different profiles:
| Criterion | Elasticsearch | Algolia | Typesense |
|---|---|---|---|
| Model | Self-hosted / Cloud | SaaS | Self-hosted / Cloud |
| Cost (average catalog) | $50-200/mo (infra) | $500-2,000/mo | $30-100/mo (Cloud) |
| Search latency | 50-200ms | 1-10ms | 5-50ms |
| Initial setup | High | Low | Medium |
| Fuzzy search | Configurable | Automatic | Automatic |
| Synonyms | Yes | Yes | Yes |
| Custom ranking | High control | Medium control | Medium control |
| Scalability | Very high | High (managed) | High |
| Best for | Complex ops, high volume | Teams without search expertise | Open-source Algolia alternative |
Elasticsearch is the choice for operations needing maximum control: custom ranking with multiple signals (sales, margin, stock, user behavior), search log analysis, correlation with other data sources. The learning curve is steep — configuring text analyzers for English, field mappings, and relevance strategies requires specialized knowledge.
Algolia is the choice when implementation speed matters more than control. The API is simple, the dashboard allows ranking adjustments without code, and the 1-10ms performance is hard to beat. Cost scales with operation volume, which can make it expensive for high-traffic e-commerce.
Typesense is the open-source alternative with a UX close to Algolia. It can be self-hosted, eliminating the SaaS cost. Documentation is excellent and features like fuzzy search and synonyms work well without extensive configuration. For teams wanting Algolia's comfort without the cost, Typesense is the natural choice.
Autocomplete, Fuzzy Search, and Synonyms
Three features that transform the search experience:
Autocomplete suggests terms as the user types. Efficient implementation uses a separate suggestions index (based on converted search history, not just the catalog) with a 200-300ms debounce to avoid excessive requests. Autocomplete should prioritize categories and brands, not just individual products.
// Search with Typesense — fuzzy search + filters + ranking
const result = await typesense.collections("products").documents().search({
q: searchTerm,
query_by: "name,description,brand,category,tags",
query_by_weights: "4,2,3,2,1", // Higher weight for name and brand
num_typos: 2, // Tolerance for 2 typos
typo_tokens_threshold: 1,
filter_by: `stock:>0 && active:true`, // Only available products
sort_by: "_text_match:desc,sales_30d:desc,margin:desc",
facet_by: "category,brand,price_range",
per_page: 24,
highlight_full_fields: "name,brand",
});
Fuzzy search handles typos using edit distance (Levenshtein algorithm or similar). Fuzzy search with 1-2 character tolerance resolves "sneaker" vs "sneeaker", "headphone" vs "headphne", "bluetooth" vs "bluetoth". Calibration matters: too high a tolerance returns irrelevant results.
Synonyms require manual curation but have high impact. "Smartphone" and "cell phone", "notebook" and "laptop", "couch" and "sofa" are synonyms that users alternate and that literal searches don't resolve. Many engines support bidirectional and unidirectional synonyms — "smartphone → cell phone" can be unidirectional if products are indexed as "cell phone" but users type "smartphone".
Ranking by Relevance, Sales, and Margin
Ordering results by pure textual relevance is insufficient. A product appearing first in results but with zero sales in the last 30 days is probably not what the customer wants. Ranking in professional e-commerce combines multiple signals:
Textual relevance: The search engine calculates a similarity score between the searched term and indexed fields. This is the starting point.
Popularity signals: Sales in the last 7/30/90 days, number of views, click-through rate in search results (CTR), and conversion rate from search to product. Popular products are generally the most relevant to the user.
Business signals: Contribution margin, sponsored products, actively promoted products, high-stock products (avoid the frustration of "out of stock" right after clicking). The business may want to rank higher-margin products above equally relevant alternatives.
Availability signals: Out-of-stock products should appear last or outside main results. Showing unavailable products at the top destroys the experience.
The combination of these signals forms a composite score. Tools like Elasticsearch allow scripted queries that calculate custom scores combining all these factors. Algolia and Typesense have sort_by with multiple fields, but less flexibility for complex scoring.
An effective pattern is to create a ranking_score field calculated offline (updated daily or in real-time via sale event) and use that field for sorting:
ranking_score = text_relevance * 0.4
+ normalize(sales_30d) * 0.3
+ normalize(margin) * 0.2
+ promo_boost * 0.1
Conclusion
Relevant search isn't a luxury for large e-commerce — it's a necessity for any store with more than 200 products. The difference between SQL LIKE and a dedicated engine translates directly into conversion: users who find what they're looking for buy; users who don't find it leave.
Correct implementation — choosing the right engine for the operation's profile, configuring text analyzers, curated synonyms, ranking with business signals — requires experience to avoid months of trial and error. At SystemForge, we implement e-commerce search solutions from scratch, including Elasticsearch, Algolia, and Typesense integration, with analytics dashboards to track zero-result searches and curation opportunities. Talk to our team.
Need help?


