Node Tutorial Part 18: Full Text Search
Welcome to part 18 of Let’s Make a Web App, a tutorial series about building a web app with Node. This series will walk you through the major areas you’ll need to face when building your own applications. These tutorials are tagged with lmawa.
Full Text Search
Given that we’re making a document-based system, wouldn’t it be nice if we had full text search? Mongo doesn’t explicitly support full text search, but simply saving a list of keywords will work.
The list of keywords can be modeled as an array of strings:
Document = new Schema({
  'title': { type: String, index: true },
  'data': String,
  'tags': [String],
  'keywords': [String],
  'user_id': ObjectId
});
Next we need to extract the strings from the document’s content. Mongoose middleware is the perfect way to do this:
Document.pre('save', function(next) {
  this.keywords = extractKeywords(this.data);
  next();
});
The question is, how should extractKeywords work? I’d usually rely on a full text indexer or a stemming library, but a simple function will be easier for now. Let’s use the following algorithm:
- Split on white space
- Find words longer than two characters
- Remove duplicates
Implementing this is fairly easy with the filter iterator:
function extractKeywords(text) {
  if (!text) return [];
  return text.
    split(/\s+/).
    filter(function(v) { return v.length > 2; }).
    filter(function(v, i, a) { return a.lastIndexOf(v) === i; });
}
The regular expression matches runs of white space. The last filter removes duplicates by checking whether the index of the current value is the same as the last position at which it appears, so only the final occurrence of each word is kept. This is a quick and dirty solution; you’d want to spend more time on this for a production system.
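To make the limits of this concrete, here is a small sketch that runs the function on a sample string. It also tries a variant, which I’ve called normalizeKeywords (a hypothetical name, not part of the project’s code), that lowercases the text and replaces punctuation with spaces before applying the same filters. It’s only an illustration of the kind of cleanup a production system might start with, and it assumes plain ASCII text.

```javascript
// The function from above, unchanged.
function extractKeywords(text) {
  if (!text) return [];
  return text.
    split(/\s+/).
    filter(function(v) { return v.length > 2; }).
    filter(function(v, i, a) { return a.lastIndexOf(v) === i; });
}

// Hypothetical variant: lowercase the text and turn anything that is not
// an ASCII letter, digit, or white space into a space before extracting.
// Note that this also discards accented and other non-ASCII letters.
function normalizeKeywords(text) {
  if (!text) return [];
  return extractKeywords(text.toLowerCase().replace(/[^a-z0-9\s]/g, ' '));
}

var sample = 'Node is fun. node is fast, and Node is fun.';

console.log(extractKeywords(sample));
// → [ 'node', 'fast,', 'and', 'Node', 'fun.' ]

console.log(normalizeKeywords(sample));
// → [ 'fast', 'and', 'node', 'fun' ]
```

The first result shows that “Node” and “node” are stored as different keywords and that punctuation stays attached to words, so a search for “fun” would not match the original output.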
Express Action
I’ve added routes for /search and /documents/titles. The titles route will just return a list of all titles with IDs, because the document index method returns documents with all their content.
app.get('/documents/titles.json', loadUser, function(req, res, next) {
  Document.find({ user_id: req.currentUser.id },
    [], { sort: ['title', 'descending'] },
    function(err, documents) {
      // Pass database errors to Express rather than mapping over undefined
      if (err) return next(err);
      // Return only the fields the document list needs
      res.send(documents.map(function(d) {
        return { title: d.title, _id: d._id };
      }));
    });
});

// Search
app.post('/search.:format?', loadUser, function(req, res, next) {
  Document.find({ user_id: req.currentUser.id, keywords: req.body.s ? req.body.s : null },
    [], { sort: ['title', 'descending'] },
    function(err, documents) {
      if (err) return next(err);
      switch (req.params.format) {
        case 'json':
          res.send(documents.map(function(d) {
            return { title: d.title, _id: d._id };
          }));
          break;
      }
    });
});
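The search action expects a POST with an s parameter containing the term to search for. Because keywords is an array, Mongo matches any document whose array contains that exact string, so this only finds single, whole words and is case-sensitive.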
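Since the route passes the s parameter to Mongo as a single string, a query like “full text” would be looked up as one keyword. One possible extension is to split the input into words and require all of them with Mongo’s $all operator. The helper below, buildSearchQuery, is a hypothetical name of my own and only builds the query object; it isn’t part of this week’s commit.

```javascript
// Hypothetical helper: turn a raw search string into a Mongo query object.
// Terms are split the same way extractKeywords splits document text, and
// are left case-sensitive to match how keywords are stored.
function buildSearchQuery(userId, input) {
  var terms = (input || '').
    split(/\s+/).
    filter(function(v) { return v.length > 2; });

  var query = { user_id: userId };
  // $all matches documents whose keywords array contains every term.
  // With no usable terms, fall back to null as the original route does.
  query.keywords = terms.length > 0 ? { $all: terms } : null;
  return query;
}

console.log(JSON.stringify(buildSearchQuery('abc123', 'Full  text search')));
// → {"user_id":"abc123","keywords":{"$all":["Full","text","search"]}}
```

In the route, the result would replace the first argument to Document.find, with req.currentUser.id and req.body.s passed in.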
Interface

I’ve added a search bar on the top-right. It only took a little bit of Jade in the views/layout.jade file:
#container
  #header
    ul
      li
        h1
          a(href='/') #{nameAndVersion(appName, version)}
      - if (typeof currentUser !== 'undefined')
        li.right
          a#logout(href='/sessions') Log Out
        li.right
          form.search(action='/search')
            input(name='s', value='Search')
With some Stylus:
form.users input[type=submit]
  margin-left 140px
  clear both
form.search
  margin-right 10px
#show-all
  color medium-grey
Now, this is where I start wishing we were already using Backbone.js. I’ve created a function for inserting documents into the list, and one to call the search method:
// Search bar
function showDocuments(results) {
  for (var i = 0; i < results.length; i++) {
    $('#document-list').append('<li><a id="document-title-' + results[i]._id + '" href="/documents/' + results[i]._id + '">' + results[i].title + '</a></li>');
  }
}

function search(value) {
  $.post('/search.json', { s: value }, function(results) {
    $('#document-list').html('');
    $('#document-list').append('<li><a id="show-all" href="#">Show All</a></li>');
    if (results.length === 0) {
      alert('No results found');
    } else {
      showDocuments(results);
    }
  }, 'json');
}
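One thing to watch in showDocuments is that titles are concatenated straight into markup, so a title containing HTML would be rendered as HTML. A minimal sketch of one way around this is to escape the values first. The names escapeHtml and documentListItem are my own hypothetical helpers, not functions from the project or from jQuery.

```javascript
// Hypothetical helper: escape characters that are significant in HTML.
// The ampersand is replaced first so later entities are not double-escaped.
function escapeHtml(text) {
  return String(text).
    replace(/&/g, '&amp;').
    replace(/</g, '&lt;').
    replace(/>/g, '&gt;').
    replace(/"/g, '&quot;').
    replace(/'/g, '&#39;');
}

// Builds the same list item markup as showDocuments, with escaped values.
function documentListItem(doc) {
  var id = escapeHtml(doc._id);
  return '<li><a id="document-title-' + id + '" href="/documents/' + id + '">' +
    escapeHtml(doc.title) + '</a></li>';
}

console.log(documentListItem({ _id: '42', title: '<b>Notes</b> & "ideas"' }));
// → <li><a id="document-title-42" href="/documents/42">&lt;b&gt;Notes&lt;/b&gt; &amp; &quot;ideas&quot;</a></li>
```

Inside showDocuments, the string passed to append could then be documentListItem(results[i]).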
The following handlers show and hide the “Search” text in the input, submit the search form, and respond to clicks on the “Show All” link:
$('input[name="s"]').focus(function() {
  var element = $(this);
  if (element.val() === 'Search')
    element.val('');
});

$('input[name="s"]').blur(function() {
  var element = $(this);
  if (element.val().length === 0)
    element.val('Search');
});

$('form.search').submit(function(e) {
  search($('input[name="s"]').val());
  e.preventDefault();
});

$('#show-all').live('click', function(e) {
  $.get('/documents/titles.json', function(results) {
    $('#document-list').html('');
    showDocuments(results);
    if (results.length > 0)
      $('#document-title-' + results[0]._id).click();
  });
  e.preventDefault();
});
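The search function also inserts a list item titled “Show All”. It’s styled a little bit differently, and clicking it calls /documents/titles.json to fetch all the titles again. The click handler is bound with jQuery’s live method because the link doesn’t exist when the page first loads.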
Indexing
I’ve added a Jake task that can be run with jake index. It forces every document to save, which triggers the save middleware. This is just a convenience so that documents created before this change get their keywords generated.
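The task itself isn’t shown here, so the following is only a guess at what its core loop might look like, not the code from the commit. The saveAll helper is a hypothetical name; it assumes each item has a save(callback) method, which is how Mongoose documents behave, and uses stand-in objects so it can run without a database.

```javascript
// Hypothetical helper: save each document in turn so that any
// pre('save') middleware runs and regenerates the keywords.
function saveAll(documents, done) {
  var index = 0;
  function next(err) {
    if (err) return done(err);
    if (index >= documents.length) return done(null, index);
    documents[index++].save(next);
  }
  next();
}

// Stand-in objects so the sketch runs without Mongo.
var saved = [];
function fakeDocument(title) {
  return {
    save: function(callback) {
      saved.push(title);
      callback(null);
    }
  };
}

saveAll([fakeDocument('one'), fakeDocument('two')], function(err, count) {
  console.log(err, count, saved);
  // → null 2 [ 'one', 'two' ]
});
```

In a real task the array would come from a Document.find call that loads every document, and the final callback would close the database connection.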
Conclusion
Full text search with Mongo is fairly easy, but this implementation is far from perfect. The keyword extraction algorithm could do with stemming, and the interface isn’t as intuitive as I’d like.
This week’s code was commit ceb9b32.