C - How to compare extended char sequence with function strcmp()?

I need to check whether a string is equal to the following extended character sequence: "———" (ALT + 0151 typed three times), which appears in a text file. How can I do this with strcmp()?

A piece of the example text file (TSV):

Piracicaba Av. Armando Salles de Oliveira Lado par 13400-005 Centro
Piracicaba Tv. Agostinho Frasson ——— 13400-008 Centro
Piracicaba Av. Armando Salles de Oliveira Lado ímpar 13400-010 Centro

When I read the file and print the field, "ùùù" is displayed on the screen.

The structure:

typedef struct {
    char cidade[50];
    char tipoLogradouro[20];
    char logradouro[50];
    char trecho[30];
    char cep[10];
    char bairro[50];
} Endereco;

The test is inside 'switch case' and the program is crashing in this part:

case 3:
      {
          if(strcmp(token, "———") == 0) // Change to "ùùù" and fails too. 
              strcpy(registro[i].trecho, NULL);
          else
              strcpy(registro[i].trecho, token);
          break;
      }

Thanks a lot.

c char strcmp

Answers

answered 3 months ago Ben #1

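strcmp works on any pair of valid, NUL-terminated strings; it only goes wrong if an argument is a null pointer or is not terminated. So you can pretty much just do: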

if (strcmp(inputString, "———") == 0) {
   printf("Strings equal\n");
} else {
   printf("Strings unequal\n");
}

If you're just trying to see whether the string occurs inside a larger string, strstr is the function you're looking for, not strcmp.
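To make the difference concrete, here is a minimal sketch. It assumes the file stores ALT + 0151 as the single byte 0x97 (which depends on the file's encoding), and the names MARKER, is_marker and contains_marker are made up for illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the file is Windows-1252, where ALT+0151 is the byte 0x97. */
static const char MARKER[] = "\x97\x97\x97";

/* True only when the whole field is exactly the marker (strcmp). */
bool is_marker(const char *field)
{
    return field != NULL && strcmp(field, MARKER) == 0;
}

/* True when the marker appears anywhere inside the text (strstr). */
bool contains_marker(const char *text)
{
    return text != NULL && strstr(text, MARKER) != NULL;
}
```

With an already split field, is_marker(token) is the check you want; contains_marker would only be useful on a whole, unsplit line.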

answered 3 months ago tadman #2

strcpy is for one thing and one thing only, and that is copying one string to another. If you give it NULL, that's not a string, and dereferencing a NULL pointer is going to cause a crash.

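What you want is to store an empty string instead. Since trecho is a char array rather than a pointer, it cannot be assigned NULL:

 if (strcmp(token, "———") == 0)
    // Mark the field as empty
    registro[i].trecho[0] = '\0';
 else
    // Copy string to buffer
    strcpy(registro[i].trecho, token);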

Remember that strcpy is a very risky function to use, as it assumes a lot about the destination buffer. If trecho isn't large enough to hold the token string, including the NUL terminator, you get undefined behaviour. If token isn't properly NUL-terminated, you get undefined behaviour. There are many ways this seemingly harmless code can go haywire.
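One way to guard against the overflow case is to check the length before copying. The following is only a sketch; copy_field is a hypothetical helper, not a standard function, and it still assumes src is NUL-terminated because it relies on strlen:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst only if it fits, terminator included.
 * Returns false and stores an empty string when src is NULL or too long.
 */
bool copy_field(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || dst_size == 0)
        return false;

    if (src == NULL) {
        dst[0] = '\0';
        return false;
    }

    size_t len = strlen(src);
    if (len >= dst_size) {      /* no room for the text plus its '\0' */
        dst[0] = '\0';
        return false;
    }

    memcpy(dst, src, len + 1);  /* len + 1 copies the '\0' as well */
    return true;
}
```

In the question's code this would be called as copy_field(registro[i].trecho, sizeof registro[i].trecho, token), and the return value tells you whether the token was stored or rejected.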

answered 3 months ago bruceg #3

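In C, the bytes that a non-ASCII character in a quoted string produces depend on how the source file is saved and how the compiler reads it, so "———" typed directly into the source may not match the bytes in the data file. To be explicit, use the \x escape sequence with the hexadecimal code of each byte. ALT + 0151 is byte 151 in the Windows-1252 code page, and 151 decimal is 97 hex, so in your case you can type "\x97\x97\x97". (The console prints "ùùù" because byte 0x97 is ù in OEM code pages such as 437 and 850.)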

case 3:
{
      if(strcmp(token, "\x97\x97\x97") == 0) 
          registro[i].trecho[0] = '\0'; /* store an empty string; strcpy from NULL crashes */
      else
          strcpy(registro[i].trecho, token);
      break;
}
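If you are not sure how the file is encoded, it helps to look at the actual bytes of the token. Below is a rough sketch with two hypothetical helpers: is_dash_marker accepts the marker in either Windows-1252 (0x97) or UTF-8 (0xE2 0x80 0x94 per dash), and format_hex renders a string's bytes as text so you can print them. Which encoding your file really uses is an assumption you should verify.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Three dashes as stored in Windows-1252 and in UTF-8 (U+2014). */
static const char MARKER_CP1252[] = "\x97\x97\x97";
static const char MARKER_UTF8[]   = "\xE2\x80\x94\xE2\x80\x94\xE2\x80\x94";

/* True when token is exactly the three-dash marker in either encoding. */
bool is_dash_marker(const char *token)
{
    if (token == NULL)
        return false;
    return strcmp(token, MARKER_CP1252) == 0 ||
           strcmp(token, MARKER_UTF8) == 0;
}

/*
 * Write the bytes of s into out as space-separated hex pairs ("97 97 97").
 * Stops early if out is too small; returns the number of characters written.
 */
size_t format_hex(const char *s, char *out, size_t out_size)
{
    size_t used = 0;

    if (s == NULL || out == NULL || out_size == 0)
        return 0;
    out[0] = '\0';

    for (size_t i = 0; s[i] != '\0'; i++) {
        int n = snprintf(out + used, out_size - used, "%s%02X",
                         i ? " " : "", (unsigned int)(unsigned char)s[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0';   /* drop the partially written pair */
            break;
        }
        used += (size_t)n;
    }
    return used;
}
```

Printing the result of format_hex for the token read from the file shows which byte sequence to compare against, instead of relying on what the console displays.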
