When you search for or look up a document, Elasticsearch by default returns all the fields in the document.
$ curl -X GET "localhost:9200/account/_doc/954?pretty"
{
  "_index" : "account",
  "_type" : "_doc",
  "_id" : "954",
  "_version" : 1,
  "_seq_no" : 790,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "account_number" : 954,
    "balance" : 49404,
    "firstname" : "Jenna",
    "lastname" : "Martin",
    "age" : 22,
    "gender" : "M",
    "address" : "688 Hart Street",
    "employer" : "Zinca",
    "email" : "jennamartin@zinca.com",
    "city" : "Oasis",
    "state" : "MD"
  }
}
But what if you want to fetch just a few fields from the document?
Solution
It is quite simple to fetch just the required fields: list them in the _source parameter of the search request body. Here is an example.
curl -X GET "localhost:9200/account/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "_source": ["firstname", "lastname", "age"],
  "query" : {
    "term" : { "age" : "22" }
  }
}
'
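The same request can also be sent from a script. Below is a minimal Python sketch that uses only the standard library. The helper names build_source_filtered_query and search are hypothetical (they are not part of any Elasticsearch client), and the URL assumes the same local node and account index used in the curl example.

```python
import json
import urllib.request


def build_source_filtered_query(fields, term_field, term_value):
    """Build a search body that asks for only `fields` in each hit's _source.

    Hypothetical helper for illustration; it simply mirrors the JSON body
    passed to curl in the example above.
    """
    return {
        "_source": list(fields),
        "query": {"term": {term_field: term_value}},
    }


def search(body, url="http://localhost:9200/account/_search"):
    """Send the body to the _search endpoint and return the parsed response.

    Assumes an Elasticsearch node is reachable at `url` (an assumption
    carried over from the curl example); not called in this sketch.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build the same request body as the curl example and print it.
body = build_source_filtered_query(["firstname", "lastname", "age"], "age", 22)
print(json.dumps(body, indent=2))
```

With a node running, calling search(body) should return a response whose hits carry only the requested fields in _source.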
Here is sample output for the search query above. Each hit's _source contains only the three requested fields (firstname, lastname, and age), and only documents with age 22 are returned.
{
  "took" : 23,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 51,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "account",
        "_type" : "_doc",
        "_id" : "75",
        "_score" : 1.0,
        "_source" : {
          "firstname" : "Sandoval",
          "age" : 22,
          "lastname" : "Kramer"
        }
      },
      {
        "_index" : "account",
        "_type" : "_doc",
        "_id" : "87",
        "_score" : 1.0,
        "_source" : {
          "firstname" : "Hewitt",
          "age" : 22,
          "lastname" : "Kidd"
        }
      },
      {
        "_index" : "account",
        "_type" : "_doc",
        "_id" : "227",
        "_score" : 1.0,
        "_source" : {
          "firstname" : "Coleman",
          "age" : 22,
          "lastname" : "Berg"
        }
      },