The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

  • My work

  • My badges

  • Twitter Updates

  • My Flickr Stream

    20140508-Delphi-2007--Project-Options--Cannot-Edit-Application-Title-HelpFile-Icon-Theming

    20140430-Fiddler-Filter-Actions-Button-Run-Filterset-now

    20140424-Windows-7-free-disk-space

    More Photos
  • Pages

  • All categories

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 1,757 other followers

git encoding trouble: recursively removing a directory where git prints out a different name than it accepts

Posted by jpluimers on 2017/05/11

The story so far:

A few years back I put all my conferences material in a GitHub repository https://github.com/jpluimers/Conferences/. There were a lot directories and files so I didn’t pay much attention to the initial check-in list. The files had been part of copy.com syncing between Windows and Mac machines.

Often git on a Mac is a bit easier than on Windows (on a Mac you can install them with the xcode-select --install trick which installs only the Command Line Tools without having to install the full Xcode [WayBack]).

I choose a Mac because it is closer to a Linux machine than Widows so I expected no encoding trouble (as git has a Linux origin: it “was created by Linus Torvalds in 2005 for development of the Linux kernel“).

Boy I was wrong:

Recently I cloned the repository in a different place and found out a few strange things:

  1. Directories with accented characters had been duplicated, for instance in https://github.com/jpluimers/Conferences/tree/master/2011
    1. …/EKON15-Renaissance-Hotel-D%FCsseldorf
    2. …/EKON15-Renaissance-Hotel-Düsseldorf
  2. Beyond Compare would show the same content
  3. After a check-out git would not understand the %FC encoded directory name (%FC is IEC_8859-1 encoding for ü and \374 is the octal representation of 0xFC [WayBack]) and a git status would show stuff like this:
    • Untracked files:
        (use "git add ..." to include in what will be committed)
      
          EKON15-Renaissance-Hotel-D%FCsseldorf/

      or

      deleted: "EKON15-Renaissance-Hotel-D\374sseldorf/Delphi-XE2-Debugging/BO-EKON15-Delphi-XE2-Debugging.pdf"
  4. A git rm -r --cached call [WayBack] would not work, as both these would fail:
    • $ git rm -r --cached EKON15-Renaissance-Hotel-D%FCsseldorf
      fatal: pathspec 'EKON15-Renaissance-Hotel-D%FCsseldorf' did not match any files
      

      and

      $ git rm -r --cached "EKON15-Renaissance-Hotel-D\374sseldorf"
      fatal: pathspec 'EKON15-Renaissance-Hotel-D\374sseldorf' did not match any files
      
  5. a

So git could:

  • detect the directories and files
  • display the names of the detected directories and files
  • not translate back the specified names into directories and files

All if this was with:

$ git --version
git version 1.9.5 (Apple Git-50.3)

This is how I fixed it

First I created an alias:

alias git-config="echo global: ; git config --list --global ; echo local: ; git config --lis --local ; echo system: ; git config --list --system"

That allowed me to view the git settings on various levels in my system.

It revealed I didn’t have the core.precomposeunicode setting at all (valid values are true or false). I also read various stories about one or both being the correct value: osx – Git and the Umlaut problem on Mac OS X – Stack Overflow [WayBack].

 

 

–jeroen

Result of git status:

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: