Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

google apps script - DocsList File getContentAsString() missing unicode characters

I am trying to import a CSV file containing French accents using Google Apps Script, reading the file with getContentAsString() and then processing it into a Google Spreadsheet. The accented characters come back as garbage.

After some investigation, it seems that getContentAsString() decodes files as UTF-8. This causes problems when the file was created using Western (Mac OS Roman) or Western (Windows Latin 1), the default encodings in older versions of Excel when exporting CSV.
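To make the mismatch concrete, here is a minimal sketch in plain JavaScript, not tied to any Apps Script service. The helper names (latin1Bytes, utf8Bytes, looksLikeValidUtf8) are hypothetical, and the validity check is deliberately simplified: it only looks at lead and continuation byte patterns and ignores overlong encodings and surrogate ranges.

```javascript
// Encode a string as ISO-8859-1 (Latin-1) bytes: one byte per character.
// Only works for code points up to 0xFF.
function latin1Bytes(str) {
  var bytes = [];
  for (var i = 0; i < str.length; i++) {
    var code = str.charCodeAt(i);
    if (code > 0xff) {
      throw new Error('Not representable in Latin-1: ' + str.charAt(i));
    }
    bytes.push(code);
  }
  return bytes;
}

// Encode a string as UTF-8 bytes by parsing encodeURIComponent's percent-escapes.
function utf8Bytes(str) {
  var escaped = encodeURIComponent(str);
  var bytes = [];
  for (var i = 0; i < escaped.length; i++) {
    if (escaped.charAt(i) === '%') {
      bytes.push(parseInt(escaped.slice(i + 1, i + 3), 16));
      i += 2; // skip the two hex digits just consumed
    } else {
      bytes.push(escaped.charCodeAt(i));
    }
  }
  return bytes;
}

// Simplified structural check: does every lead byte have the expected
// number of 10xxxxxx continuation bytes after it?
function looksLikeValidUtf8(bytes) {
  var i = 0;
  while (i < bytes.length) {
    var b = bytes[i];
    var extra;
    if (b < 0x80) extra = 0;                  // 0xxxxxxx: ASCII
    else if ((b & 0xe0) === 0xc0) extra = 1;  // 110xxxxx
    else if ((b & 0xf0) === 0xe0) extra = 2;  // 1110xxxx
    else if ((b & 0xf8) === 0xf0) extra = 3;  // 11110xxx
    else return false;                        // stray continuation or invalid lead
    for (var k = 1; k <= extra; k++) {
      if (i + k >= bytes.length || (bytes[i + k] & 0xc0) !== 0x80) return false;
    }
    i += extra + 1;
  }
  return true;
}
```

In Latin-1, "é" is the single byte 0xE9, while in UTF-8 it is the pair 0xC3 0xA9. A UTF-8 decoder treats 0xE9 as the start of a three-byte sequence, and the "q" that follows is not a continuation byte, so the character cannot be decoded and is replaced.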

Any suggestions on how to work around this problem?

Example: ?quipement should be équipement

function Test() {
  var filename = 'BV_period_2.csv';
  var files = DocsList.getFiles();
  var csvFile = "";

  for (var i = 0; i < files.length; i++) {
    if (files[i].getName() == filename ) {
      csvFile = files[i].getContentAsString(); // accented characters come back as "?"
      break;
    }
  }

  var csvData = CSVToArray(csvFile, ",");
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = ss.getSheetByName('TestBV');
  ...

1 Answer

You can optionally specify the charset when reading the file's blob as a string, so pass the encoding the file was actually saved in. Here's a UTF-16 example:

DocsList.getFileById(<some id>).getBlob().getDataAsString("UTF-16")
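For the CSV in the question, the same call would presumably take a single-byte Western charset instead of UTF-16. The sketch below is one way to wrap it, under a few assumptions: the function names and the fileId variable are hypothetical, the charset name 'ISO-8859-1' is assumed to be accepted by the runtime, and DriveApp is used in the usage comment because DocsList has since been retired. The second helper is a manual fallback that relies on Latin-1 bytes mapping one-to-one onto the first 256 Unicode code points.

```javascript
// Read a file's content as text using an explicit charset.
// `file` is any object exposing getBlob(), for example the result of
// DriveApp.getFileById(fileId) in Apps Script (fileId is a placeholder).
function readTextWithCharset(file, charset) {
  return file.getBlob().getDataAsString(charset);
}

// Fallback: decode ISO-8859-1 bytes by hand.
// Apps Script byte arrays are signed (-128..127), so mask each value to 0..255.
function decodeLatin1(bytes) {
  var out = '';
  for (var i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i] & 0xff);
  }
  return out;
}

// Possible usage inside Apps Script (not executed here):
//   var file = DriveApp.getFileById(fileId);
//   var csvFile = readTextWithCharset(file, 'ISO-8859-1');
//   // or, decoding manually:
//   var csvFile2 = decodeLatin1(file.getBlob().getBytes());
```

Note that the manual fallback only covers ISO-8859-1. Windows-1252 differs from it in the 0x80 to 0x9F range, and Mac OS Roman uses a different layout altogether, so files exported in those encodings would need the matching charset name or their own lookup table.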
