Elasticsearch in Django - sort alphabetically

I have the following document:

@brand.doc_type
class BrandDocument(DocType):

    class Meta:
        model = Brand

    id = IntegerField()
    name = StringField(
        fields={
            'raw': {
                'type': 'keyword',
                'fielddata': True,
            }
        },
    )
    lookup_name = StringField(
        fields={
            'raw': {
                'type': 'string',
            }
        },
    )

and I try to run a sorted search like this:

BrandDocument.search().sort({
    'name.keyword': order,
})

The problem is that the results are sorted case-sensitively: instead of 'a', 'A', 'ab', 'AB' I get 'A', 'AB', 'a', 'ab'. How can this be fixed?

EDIT: After some more searching, I came up with something like this:

lowercase_normalizer = normalizer(
    'lowercase_normalizer',
    filter=['lowercase']
)

lowercase_analyzer = analyzer(
    'lowercase_analyzer',
    tokenizer="keyword",
    filter=['lowercase'],
)


@brand.doc_type
class BrandDocument(DocType):

    class Meta:
        model = Brand

    id = IntegerField()
    name = StringField(
        analyzer=lowercase_analyzer,
        fields={
            'raw': Keyword(normalizer=lowercase_normalizer, fielddata=True),
        },
    )

The issue persists, however, and I can't find in the docs how this normalizer should be used.

edited Nov 20 '18 at 15:21

asked Nov 20 '18 at 11:14

gonczor


Take a look at this answer; stackoverflow.com/a/22100849/1199464

– markwalker_
Nov 20 '18 at 11:52

You are storing the values as is, so they will be sorted case-sensitively. If you want a different sort order, you need to store the values differently (case-insensitively; for languages with diacritics, you might also want to consider a filter like ICU to resolve accents and such, so that ü, ue, ú are sorted accordingly).

– Risadinha
Nov 20 '18 at 13:40
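
The point made in the comment above can be reproduced outside Elasticsearch. The sketch below is plain Python, not Elasticsearch code: it assumes that an un-normalized keyword value is ordered code point by code point, much as Python orders strings, and it uses the sample values from the question.

```python
# Sample values from the question.
names = ['a', 'A', 'ab', 'AB']

# Default string ordering compares code points, so every uppercase
# ASCII letter (65-90) sorts before every lowercase one (97-122).
case_sensitive = sorted(names)

# Sorting on a lowercased key mimics what a lowercase filter or
# normalizer does to the indexed value; ties keep their input order.
case_insensitive = sorted(names, key=str.lower)

print(case_sensitive)    # ['A', 'AB', 'a', 'ab']
print(case_insensitive)  # ['a', 'A', 'ab', 'AB']
```

The second ordering is the one the question asks for, which is why the values have to be lowercased before they are indexed.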

Tags: django, elasticsearch, elasticsearch-dsl


1 Answer

I would suggest creating a custom analyzer with a lowercase filter and applying it to the field at index time.

To do that, add the following to the index settings:

{
  "index": {
    "analysis": {
      "analyzer": {
        "custom_sort": {
          "tokenizer": "keyword",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  }
}
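
To see why the analyzer pairs the keyword tokenizer with the lowercase filter, here is a rough plain-Python imitation of the analysis chain. The function names are hypothetical and this is not how Elasticsearch is implemented; the whitespace split merely stands in for a word-splitting tokenizer.

```python
from typing import List


def keyword_tokenizer(text: str) -> List[str]:
    """Imitate the keyword tokenizer: the whole input is one token."""
    return [text]


def whitespace_split(text: str) -> List[str]:
    """Stand-in for a word-splitting tokenizer (illustration only)."""
    return text.split()


def lowercase_filter(tokens: List[str]) -> List[str]:
    """Imitate the lowercase token filter."""
    return [token.lower() for token in tokens]


def custom_sort(text: str) -> List[str]:
    """Hypothetical model of the custom_sort analyzer defined above."""
    return lowercase_filter(keyword_tokenizer(text))


print(custom_sort("Acme Corp"))                         # ['acme corp']
print(lowercase_filter(whitespace_split("Acme Corp")))  # ['acme', 'corp']
```

With a single lowercased token per value there is exactly one term to sort on, whereas a field split into several tokens has no single obvious sort key.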

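Then add the field you want to sort on to the mapping, using the custom_sort analyzer. Sorting on a text field requires fielddata, which is disabled by default, so enable it as well:

{
    "properties": {
        "sortField": {
            "type": "text",
            "analyzer": "custom_sort",
            "fielddata": true
        }
    }
}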

If the field already exists in the mapping, you can instead add a sub-field that uses the analyzer to the existing field.

Assuming the field name already exists with type keyword, update it as follows (fielddata is enabled so that the text sub-field can be sorted on):

{
    "properties": {
        "name": {
            "type": "keyword",
            "fields": {
                "sortval": {
                    "type": "text",
                    "analyzer": "custom_sort",
                    "fielddata": true
                }
            }
        }
    }
}

Once that is done, reindex your data so that the lowercased values are indexed. You can then sort on the field:

Case 1 (new field):

"sort": [
    {
      "sortField": "desc"
    }
]

Case 2 (sub field):

"sort": [
    {
      "name.sortval": "desc"
    }
]
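
Since the question asks how a normalizer is meant to be used, here is a sketch of an alternative setup: a keyword sub-field with a lowercase normalizer, which keyword fields support in Elasticsearch 5.2 and later, to my knowledge. The code only builds the request bodies as Python dictionaries. The helper names are hypothetical, the `raw` sub-field name is taken from the question, and how the bodies are sent to the cluster is left out.

```python
import json


def build_settings() -> dict:
    """Index settings that declare a lowercase normalizer."""
    return {
        "analysis": {
            "normalizer": {
                "lowercase_normalizer": {
                    "type": "custom",
                    "filter": ["lowercase"],
                }
            }
        }
    }


def build_mapping(field: str = "name", subfield: str = "raw") -> dict:
    """Mapping with a normalized keyword sub-field used only for sorting."""
    return {
        "properties": {
            field: {
                "type": "text",
                "fields": {
                    subfield: {
                        "type": "keyword",
                        "normalizer": "lowercase_normalizer",
                    }
                },
            }
        }
    }


def build_sort(field: str = "name", subfield: str = "raw",
               order: str = "asc") -> dict:
    """Sort clause; the key must match the sub-field name in the mapping."""
    return {"sort": [{f"{field}.{subfield}": order}]}


print(json.dumps(build_sort(), indent=2))
```

A keyword field is sorted through doc values, so this variant needs no fielddata. The sort key has to name the sub-field exactly as it is declared in the mapping (name.raw here), and existing documents still have to be reindexed before the normalized values take effect.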

answered Nov 22 '18 at 6:06

Nishant Saini

