I have a basic question regarding working with arrays:

```
a= ([ c b a a b b c a a b b c a a b a c b])
b= ([ 0 1 0 1 0 0 0 0 2 0 1 0 2 0 0 1 0 1])
```

I) Is there a short way to count the number of times `'c'` in `a` corresponds to 0, 1, and 2 in `b`, and `'b'` in `a` corresponds to 0, 1, 2, and so on?

II) How do I create a new array `c` (subset of `a`) and `d` (subset of `b`) such that they only contain those elements for which the corresponding element in `a` is `'c'`?

answered 4 months ago Gene Burinsky #1

The questions are a bit vague, but here's a quick method (some would call it dirty) using `pandas`, though I think something written without recourse to `pandas` should be preferred.

```
import pandas as pd
#create OP's lists
a= [ 'c', 'b', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'a', 'c', 'b']
b= [ 0, 1, 0, 1, 0, 0, 0, 0, 2, 0, 1, 0, 2, 0, 0, 1, 0, 1]
#dump lists to a Pandas DataFrame
df = pd.DataFrame({'a':a, 'b':b})
```

Provided I interpreted it correctly, you can cross-tabulate the two arrays: `pd.crosstab(df.a, df.b).stack()`. Cross-tabulation counts the number of times each number corresponds to a particular letter. `.stack` turns the output from `.crosstab` into a more legible format.

```
#question 1
pd.crosstab(df.a, df.b).stack()
Out[9]:
a  b
a  0    3
   1    2
   2    2
b  0    4
   1    3
   2    0
c  0    4
   1    0
   2    0
dtype: int64
```
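For readers who would rather avoid `pandas`, the same letter-by-number tally can be sketched with the standard library alone. This is only an illustration of what the cross-tabulation computes, not the answer's own method; the names `letters`, `numbers`, and `table` are assumptions introduced here.

```python
from itertools import product

a = ['c', 'b', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'a', 'c', 'b']
b = [0, 1, 0, 1, 0, 0, 0, 0, 2, 0, 1, 0, 2, 0, 0, 1, 0, 1]

# Hypothetical helper names: every distinct letter and number seen in the data.
letters = sorted(set(a))
numbers = sorted(set(b))

# Start every (letter, number) combination at zero so that pairs which
# never occur still show up, as they do in the stacked crosstab output.
table = {pair: 0 for pair in product(letters, numbers)}
for pair in zip(a, b):
    table[pair] += 1

for (letter, number), n in table.items():
    print(letter, number, n)
```

Pre-filling the dictionary is what makes the zero rows appear; a plain counting loop would simply omit combinations that never occur.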

For question II, I use `pandas`' boolean indexing to select only the elements in array `a` that equal `'c'`. `df.a == 'c'` returns `True` for every value in `a` that is `'c'` and `False` otherwise. `df.loc[df.a == 'c', 'a']` returns the values from `a` for which the boolean statement was true.

```
c = df.loc[df.a == 'c', 'a']
d = df.loc[df.a == 'c', 'b']
In [15]: c
Out[15]:
0     c
6     c
11    c
16    c
Name: a, dtype: object
In [16]: d
Out[16]:
0     0
6     0
11    0
16    0
Name: b, dtype: int64
```
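The boolean-mask idea also works without `pandas`. The following is a rough plain-Python analogue, assuming the same two lists as above; `mask` is a name introduced here for illustration, and `itertools.compress` stands in for the `.loc` selection.

```python
from itertools import compress

a = ['c', 'b', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'a', 'c', 'b']
b = [0, 1, 0, 1, 0, 0, 0, 0, 2, 0, 1, 0, 2, 0, 0, 1, 0, 1]

# One True/False per element of a, analogous to df.a == 'c'.
mask = [letter == 'c' for letter in a]

# compress keeps only the items whose mask entry is True.
c = list(compress(a, mask))
d = list(compress(b, mask))

print(c)
print(d)
```

Unlike the `pandas` result, these are plain lists, so the original positions (0, 6, 11, 16) are not carried along.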

answered 4 months ago LMD #2

Python lists (https://www.tutorialspoint.com/python/python_lists.htm) have a `count` method.

I suggest you first zip both lists, as said in the comments, and then count occurrences of the tuple `('c', 1)` and occurrences of the tuple `('c', 0)` and sum them up; that's basically what you need for (I).

For (II), if I understood you correctly, you have to take the zipped lists and apply `filter` to them with a predicate that keeps the pairs whose first element is `'c'`, e.g. `lambda x: x[0] == 'c'`.
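A minimal sketch of the zip, count, and filter steps, assuming Python 3 (where `zip` returns an iterator, so it is wrapped in `list` before calling `count`) and assuming the intended predicate for (II) is "the letter is `'c'`". The variable names are illustrative only.

```python
a = ['c', 'b', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'a', 'c', 'b']
b = [0, 1, 0, 1, 0, 0, 0, 0, 2, 0, 1, 0, 2, 0, 0, 1, 0, 1]

# list() is needed on Python 3 because a zip iterator has no count method.
pairs = list(zip(a, b))

# (I) list.count takes a single argument, so the pair is passed as one tuple.
c_with_0 = pairs.count(('c', 0))
c_with_1 = pairs.count(('c', 1))
c_with_2 = pairs.count(('c', 2))

# (II) keep only the pairs whose letter is 'c', then split them back apart.
kept = list(filter(lambda x: x[0] == 'c', pairs))
c = [letter for letter, _ in kept]
d = [number for _, number in kept]

print(c_with_0, c_with_1, c_with_2)
print(c, d)
```

Each `count` call scans the whole list, which is fine for a handful of pairs; for many distinct pairs a single tallying pass would likely be cheaper.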

answered 4 months ago Srini #3

```
In [10]: p = ['a', 'b', 'c', 'a', 'c', 'a']
In [11]: q = [1, 2, 1, 3, 3, 1]
In [12]: z = list(zip(p, q))  # list() so this also works on Python 3, where zip returns an iterator
In [13]: z
Out[13]: [('a', 1), ('b', 2), ('c', 1), ('a', 3), ('c', 3), ('a', 1)]
In [14]: counts = {}
In [15]: for pair in z:
    ...:     if pair in counts:
    ...:         counts[pair] += 1
    ...:     else:
    ...:         counts[pair] = 1
    ...:
In [16]: counts
Out[16]: {('a', 1): 2, ('a', 3): 1, ('b', 2): 1, ('c', 1): 1, ('c', 3): 1}
In [17]: sub_p = []
In [18]: sub_q = []
In [19]: for i, element in enumerate(p):
    ...:     if element == 'a':
    ...:         sub_p.append(element)
    ...:         sub_q.append(q[i])
    ...:
In [20]: sub_p
Out[20]: ['a', 'a', 'a']
In [21]: sub_q
Out[21]: [1, 3, 1]
```

*Explanation*

- `zip` takes two lists and runs a figurative zipper between them, resulting in a list of tuples.
- I've used a simplistic approach: I'm just maintaining a map/dictionary that makes note of how many times it has seen each char-int tuple.
- Then I make 2 sublists that you can modify to use the character in question and figure out what it maps to.

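As abarnert suggested, you could use a `Counter` from `collections` instead. Or you could just use the `count` method on `z`, e.g. `z.count(('a', 1))`; note that the pair is passed as a single tuple, and that `z` must be a list rather than a Python 3 `zip` iterator. Or you can use a `defaultdict` instead.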
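The `Counter` and `defaultdict` alternatives might look roughly like this, reusing the `p` and `q` lists from the session above. The names `counter_counts` and `dd_counts` are introduced here for illustration.

```python
from collections import Counter, defaultdict

p = ['a', 'b', 'c', 'a', 'c', 'a']
q = [1, 2, 1, 3, 3, 1]

# Counter tallies the (char, int) pairs in a single pass.
counter_counts = Counter(zip(p, q))

# defaultdict(int) supplies 0 for unseen keys, so the if/else is not needed.
dd_counts = defaultdict(int)
for pair in zip(p, q):
    dd_counts[pair] += 1

print(counter_counts[('a', 1)])  # 2
print(counter_counts[('b', 1)])  # 0, a Counter returns 0 for a pair it never saw
```

Both give the same tallies as the manual dictionary loop; `Counter` is the shorter of the two and also offers helpers such as `most_common`.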