TreeView.insert throws UnicodeDecodeError



























I'm trying to populate TreeView with data from os.listdir(path).



Everything works until I read a directory name containing a non-UTF-8 byte. In my case it is 0xf6, which is not valid UTF-8.



As I'm running on Windows, the charset of the names returned by os.listdir() is Windows-1252 (the ANSI code page).



How can I solve this problem to achieve correct display in TreeView?



Here is some of my code:



import os

def fill_tree(treeview, node):
    if treeview.set(node, "type") != 'directory':
        return

    path = treeview.set(node, "fullpath")
    # Delete the possibly 'dummy' node present.
    treeview.delete(*treeview.get_children(node))

    parent = treeview.parent(node)
    for p in os.listdir(path):
        ptype = None
        p = os.path.join(path, p)

        if os.path.isdir(p):
            ptype = 'directory'

        fname = os.path.split(p)[1].decode('cp1252').encode('utf8')

        if ptype == 'directory':
            oid = treeview.insert(node, 'end', text=fname, values=[p, ptype])
            treeview.insert(oid, 0, text='dummy')


Regards
Göran



































  • There are many matches for "treeview" on PyPI. Which library are you using specifically? And: what type are the offending dictionary keys, str or unicode?

    – lenz
    Nov 19 '18 at 18:59











  • I'm using Tkinter. I don't understand your question about 'offending dictionary keys'.

    – gorbos
    Nov 19 '18 at 20:24













  • Oh sorry, I didn't read carefully. I meant "directory name". But I'm pretty sure os.listdir() returns str, not unicode. You can decode the directory name using name.decode('cp1252'), which gives you a Unicode string. Then check whether TreeView.insert accepts this.

    – lenz
    Nov 19 '18 at 22:26











  • I tried name.decode('cp1252').encode('utf8') and it works fine. But I get into trouble when I continue looping through the directory tree: os.path.isdir(p) does not work as desired with the UTF-8 encoded name. Catch-22 situation?

    – gorbos
    Nov 20 '18 at 10:26











  • Sounds like you need to keep two separate variables: the CP-1252 version for os.* and the UTF-8 version for the TreeView. Or you switch to Python 3, where everything should work with Unicode strings.

    – lenz
    Nov 20 '18 at 10:27
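
The two-variable idea from the comments above can be sketched roughly as follows. This is only an illustration, assuming the names arrive as CP-1252 byte strings (as described for the asker's Python 2.7 setup on Windows); display_name is a hypothetical helper, not part of Tkinter or os.

```python
from __future__ import print_function


def display_name(raw_name, encoding='cp1252'):
    """Return a Unicode string for display; raw_name itself is not changed.

    raw_name is assumed to be a byte string such as os.listdir(byte_path)
    returns on Python 2. Unicode input is passed through unchanged.
    """
    if isinstance(raw_name, bytes):
        return raw_name.decode(encoding)
    return raw_name


raw = b'G\xf6ran'            # 0xf6 is o-umlaut in CP-1252
shown = display_name(raw)    # Unicode text, suitable for text= in insert()
# Keep `raw` for os.path.isdir() and os.listdir(); show `shown` in the tree.

try:
    raw.decode('utf-8')      # the lone byte 0xf6 is not valid UTF-8
except UnicodeDecodeError as exc:
    print('not UTF-8:', exc.reason)
```

The point of the sketch is that the filesystem name and the displayed name are kept apart, so re-encoding for display never affects later os.path calls.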























windows python-2.7 utf-8 treeview














asked Nov 19 '18 at 14:28 by gorbos, edited Nov 22 '18 at 10:07

























1 Answer














The UnicodeDecodeError is due to passing byte strings when the function is expecting Unicode strings. Python 2 attempts to decode byte strings to Unicode implicitly, and that fails on bytes such as 0xf6. Use Unicode strings explicitly instead. os.listdir(unicode_path) returns Unicode strings, for example os.listdir(u'.').
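
As a rough sketch of this advice applied to the loop in the question: when a Unicode path goes into os.listdir(), the names come back as Unicode, so the same value can be used for os.path.isdir() and for the Treeview text. list_subdirs is a hypothetical helper introduced for illustration, and the Treeview call is shown only as a comment so the snippet runs without a Tk window.

```python
from __future__ import print_function
import os


def list_subdirs(path):
    """Yield (name, fullpath) for each subdirectory of path.

    path is assumed to be a Unicode string (u'...' on Python 2), so
    os.listdir() returns Unicode names and no manual decoding is needed.
    """
    for name in sorted(os.listdir(path)):
        fullpath = os.path.join(path, name)
        if os.path.isdir(fullpath):
            yield name, fullpath


for name, fullpath in list_subdirs(u'.'):
    # Inside fill_tree this might become something like:
    #   oid = treeview.insert(node, 'end', text=name,
    #                         values=[fullpath, 'directory'])
    print(repr(name), repr(fullpath))
```

With this approach the decode('cp1252').encode('utf8') step from the question should no longer be needed, provided the root path stored in the tree is itself a Unicode string.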







































answered Nov 21 '18 at 9:14 by Mark Tolonen





























