Comparing characters in strings

Denis Moura

I'm trying to create a function that compares the characters at the same position in two strings of the same length and returns the number of positions at which they differ.

For instance,

a = "HORSE"
b = "TIGER"

And it would return 5 (as all characters in the same position are different)

Here's what I've been working on.

def Differences(one, two):
    difference = []
    for i in list(one):
        if list(one)[i] != list(two)[i]:
            difference = difference+1
    return difference

That gives an error "List indices must be integers not strings"

So I tried converting the characters to integers using int(ord(...)):

def Differences(one, two):
    difference = 0
    for i in list(one):
        if int(ord(list(one)[i])) != int(ord(list(two)[i])):
            difference = difference+1
    return difference

Which also returns the same error.

When I print list(one)[1] != list(two)[1], it returns either True or False, so the comparison itself is made correctly.

Can you tell me how to correct my code for this purpose?

Tags: python, string, python-3.x, comparison, character

Answers

answered 2 years ago John-Paul Ensign #1

You could do something like this:

def getDifferences(a,b):
  count = 0

  for i in range(len(a)):
    # compare by value with !=; "is not" would test object identity instead
    if a[i] != b[i]:
      count += 1

  return count

The only thing you will have to implement yourself here is a check on the lengths of the strings. In my example, if a is longer than b, there will be an IndexError.
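One way to add the length check mentioned above is sketched below. The name count_differences and the choice to raise ValueError on unequal lengths are assumptions made for illustration, not part of the original answer.

```python
def count_differences(a, b):
    """Count the positions at which two equal-length strings differ."""
    # Assumption: unequal lengths are treated as an error, not counted.
    if len(a) != len(b):
        raise ValueError("both strings must have the same length")
    count = 0
    for i in range(len(a)):
        if a[i] != b[i]:
            count += 1
    return count


print(count_differences("HORSE", "TIGER"))  # prints 5
```

Raising early keeps the IndexError from surfacing halfway through the loop.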

answered 2 years ago Jon McClung #2

Try this:

def Differences(one, two):
    if len(two) < len(one):
        one, two = two, one
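    res = len(two) - len(one)  # extra characters in the longer string count as differences
    for i, ch in enumerate(one):  # "ch" avoids shadowing the built-in chr()
        res += two[i] != ch  # True adds 1, False adds 0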
    return res

It's important to check the lengths first, in case the second string is shorter than the first, so that you don't get an IndexError.

answered 2 years ago Aviad #3

In terms of complexity and runtime, calling list() on every iteration is inefficient, since each call copies the string into a newly allocated list. A better way is to iterate over the indexes and compare the characters at each index, something like:

def str_compare(l1, l2):
   assert len(l1) == len(l2), 'Strings must have the same length'
   differ_cnt = 0
   for i in range(len(l1)):  # range, not xrange: the question is tagged python-3.x
       if l1[i] != l2[i]:
           differ_cnt += 1
   return differ_cnt

answered 2 years ago EngineerCamp #4

There are a couple problems with:

def Differences(one, two):
    difference = []
    for i in list(one):
        if list(one)[i] != list(two)[i]:
            difference = difference+1
    return difference

Firstly, list(one) is ['H', 'O', 'R', 'S', 'E'] when you call Differences(a, b), so you are iterating over strings, not ints. Changing your code to:

for i in range(len(one)):

will iterate over the integers 0-4, which works in your case only because a and b have the same length (you will need a better solution if you want to handle inputs of different lengths).

Secondly, you can't add an integer to a list, so you should change difference to an int that you add to. The result would be:

def Differences(one, two):
    difference = 0
    for i in range(len(one)):
        if list(one)[i] != list(two)[i]:
            difference = difference+1
    return difference

If you were super keen to use a list, you can append to it instead: difference.append(1), and then return its length: return len(difference). But this would be inefficient for what you are trying to achieve.
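For completeness, the list-based variant described above might look like the sketch below. The function name is hypothetical, equal-length inputs are assumed, and storing the index rather than 1 is my own choice so that the list carries some useful information.

```python
def differences_with_list(one, two):
    # Hypothetical name; assumes both strings have the same length.
    mismatches = []
    for i in range(len(one)):
        if one[i] != two[i]:
            mismatches.append(i)  # remember where the strings differ
    return len(mismatches)


print(differences_with_list("HORSE", "TIGER"))  # prints 5
print(differences_with_list("HORSE", "HOUSE"))  # prints 1
```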

answered 2 years ago Reid Ballard #5

I would probably just iterate over both of them at the same time with zip and a list comprehension and then take length of the list:

a='HORSE'
b='TIGER'


words=zip(a,b)
incorrect=len([c for c,d in words if c!=d])
print(incorrect)

Zipping pairs the two sequences together index-for-index, stopping when the shorter one runs out. A list comprehension is basically a compact for-statement that builds a list and lets you add a condition. So it reads: for each zipped pair of letters (c, d), if c != d then put c into the list (so each differing pair increases the list length by 1). Then we just take the length of the list, which is the number of positions at which the letters differ.
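To make the intermediate values visible, here is a small sketch of what zip and the comprehension produce for the example words. The variable names pairs and mismatched are only for illustration.

```python
a = "HORSE"
b = "TIGER"

# zip yields one tuple per position
pairs = list(zip(a, b))
print(pairs)
# [('H', 'T'), ('O', 'I'), ('R', 'G'), ('S', 'E'), ('E', 'R')]

# keep the first letter of every pair whose letters differ
mismatched = [c for c, d in pairs if c != d]
print(mismatched)       # ['H', 'O', 'R', 'S', 'E']
print(len(mismatched))  # 5

# zip stops at the end of the shorter input
print(list(zip("HORSES", "TIG")))
# [('H', 'T'), ('O', 'I'), ('R', 'G')]
```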

If we consider missing letters to be different, then we can use itertools.zip_longest to fill out the rest of the word:

import itertools

a='HORSES'
b='TIG'

words=itertools.zip_longest(a,b,fillvalue=None)
incorrect=len([c for c,d in words if c!=d]) ## No changes here
print(incorrect)

Obviously, None will never equal a character, so the difference in length will be registered.

EDIT: This hasn't been mentioned, but if we want case-insensitivity, then you just run .lower() or .casefold() on the strings beforehand.
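A minimal sketch of the case-insensitive variant, assuming a hypothetical helper name. One caveat: casefold() can change a string's length (for example, "ß" becomes "ss"), so this sketch is only meant for inputs where that does not happen.

```python
def count_differences_ignoring_case(a, b):
    # Hypothetical helper: normalise case first, then compare position by position.
    a, b = a.casefold(), b.casefold()
    return sum(1 for c, d in zip(a, b) if c != d)


print(count_differences_ignoring_case("Horse", "HORSE"))  # prints 0
print(count_differences_ignoring_case("Horse", "HOUSE"))  # prints 1
```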

answered 2 years ago user3404344 #6

sum([int(i!=j) for i,j in zip(a,b)]) would do the trick
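This works because each comparison yields a bool, and in Python True and False behave as the integers 1 and 0. As a sketch, the same idea can be written with a generator expression, which skips the int() call and the intermediate list:

```python
a = "HORSE"
b = "TIGER"

print(True + True)  # prints 2, since bool is a subclass of int

# sum the comparison results directly
total = sum(i != j for i, j in zip(a, b))
print(total)  # prints 5
```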

answered 2 years ago Jordan Bonitatis #7

Use zip to iterate over both strings in parallel:

>>> def get_difference(str_a, str_b):
...     """
...     Traverse two strings of the same length and determine the number of 
...     indexes in which the characters differ
...     """
...
...     # confirm inputs are strings
...     if not all(isinstance(x, str) for x in (str_a, str_b)):
...         raise Exception("`get_difference` requires str inputs")
...     # confirm string lengths match
...     if len(str_a) != len(str_b):
...         raise Exception("`get_difference` requires both input strings to be of equal length")
...
...     # count the differences; this is the important bit
...     ret = 0
...     for i, j in zip(str_a, str_b):
...         if i != j:
...             ret += 1
...     return ret
... 
>>> get_difference('HORSE', 'TIGER')
5

Also, the general style is to use lowercase names for functions (which are often verbs) and CapWords names for classes (which are often nouns) :)

answered 2 years ago Eduardo Cuesta #8

Find out for yourself

>>> a = "HORSE"
>>> list(a)
['H', 'O', 'R', 'S', 'E']
>>> list(a)[2]
'R'
>>> for i in list(a):
...     i
...
'H'
'O'
'R'
'S'
'E'
>>> list(a)['R']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not str

Good luck!

answered 2 years ago Nick Davies #9

a = "HORSE"
b = "TIGER"
# turn each string into a list of its characters
a_list = list(a)
b_list = list(b)

# start from the full length and subtract one for every matching position
difference = len(a)
for i in range(len(a_list)):
    if a_list[i] == b_list[i]:
        difference = difference - 1

print(difference)

See if this works :)

answered 2 weeks ago Ahmad #10

That's too simple:

def Differences(one, two):
    difference = 0
    for char1, char2 in zip(one, two):
        if char1 != char2:
            difference += 1
    return difference
