# Creating a subset of array from another array : Python

jig Source

I have a basic question regarding working with arrays:

``````
a = ['c', 'b', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'a', 'c', 'b']
b = [0, 1, 0, 1, 0, 0, 0, 0, 2, 0, 1, 0, 2, 0, 0, 1, 0, 1]
``````

I) Is there a short way to count the number of times `'c'` in `a` corresponds to 0, 1, and 2 in `b`, the number of times `'b'` in `a` corresponds to 0, 1, and 2, and so on?

II) How do I create a new array `c` (a subset of `a`) and `d` (a subset of `b`) that contain only those elements whose corresponding element in `a` is `'c'`?


answered 4 months ago Gene Burinsky #1

The questions are a bit vague, but here's a quick method (some would call it dirty) using `Pandas`, though I think something written without recourse to `Pandas` should be preferred.

``````
import pandas as pd

#create OP's lists
a= [ 'c', 'b', 'a',  'a', 'b', 'b', 'c', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'a', 'c', 'b']
b= [ 0, 1, 0, 1, 0, 0, 0, 0, 2, 0, 1, 0, 2, 0, 0, 1, 0, 1]

#dump lists to a Pandas DataFrame
df = pd.DataFrame({'a':a, 'b':b})
``````

## Question 1

Provided I interpreted it correctly, you can cross-tabulate the two arrays: `pd.crosstab(df.a, df.b).stack()`. Cross-tabulation counts the number of times each number corresponds to a particular letter. `.stack()` turns the output of `pd.crosstab` into a more legible format.

``````
#question 1
pd.crosstab(df.a, df.b).stack()

Out[9]:
a  b
a  0    3
   1    2
   2    2
b  0    4
   1    3
   2    0
c  0    4
   1    0
   2    0
dtype: int64
``````

## Question 2

Here, I use `Pandas`' boolean indexing to select only the elements in array `a` that are equal to `'c'`. `df.a == 'c'` returns `True` for every value in `a` that is `'c'` and `False` otherwise, so `df.loc[df.a == 'c', 'a']` returns the values from `a` for which the boolean statement was true.

``````
c = df.loc[df.a == 'c', 'a']
d = df.loc[df.a == 'c', 'b']

In [15]: c
Out[15]:
0     c
6     c
11    c
16    c
Name: a, dtype: object

In [16]: d
Out[16]:
0     0
6     0
11    0
16    0
Name: b, dtype: int64
``````

answered 4 months ago LMD #2

Python lists (https://www.tutorialspoint.com/python/python_lists.htm) have a `count` method.

I suggest first zipping both lists, as said in the comments, and then counting the occurrences of the tuple `('c', 1)` and of the tuple `('c', 0)` and summing them up; that is basically what you need for (I).

For (II), if I understood you correctly, you have to take the zipped lists and apply `filter` to them with `lambda x: x[0] == 'c'`.
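A minimal sketch of the zip-based approach, using the lists from the question. The names `pairs`, `c_counts`, and `kept` are illustrative and not from the answer, and the filter condition tests the letter against `'c'`, which is an assumption about what the question is asking for.

```python
a = ['c', 'b', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'b', 'c', 'a', 'a', 'b', 'a', 'c', 'b']
b = [0, 1, 0, 1, 0, 0, 0, 0, 2, 0, 1, 0, 2, 0, 0, 1, 0, 1]

# In Python 3, zip returns an iterator, so materialise it before calling list.count
pairs = list(zip(a, b))

# (I) count how often 'c' is paired with each of 0, 1 and 2
c_counts = {n: pairs.count(('c', n)) for n in (0, 1, 2)}
print(c_counts)

# (II) keep only the pairs whose letter is 'c', then split them back into two lists
kept = list(filter(lambda x: x[0] == 'c', pairs))
c = [letter for letter, _ in kept]
d = [number for _, number in kept]
print(c, d)
```

Calling `pairs.count` once per combination rescans the list each time, which is fine for short lists like these but may be slow for large inputs.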

answered 4 months ago Srini #3

``````
In [10]: p = ['a', 'b', 'c', 'a', 'c', 'a']

In [11]: q = [1, 2, 1, 3, 3, 1]

In [12]: z = list(zip(p, q))  # list() is needed in Python 3, where zip returns an iterator

In [13]: z
Out[13]: [('a', 1), ('b', 2), ('c', 1), ('a', 3), ('c', 3), ('a', 1)]

In [14]: counts = {}

In [15]: for pair in z:
    ...:     if pair in counts:
    ...:         counts[pair] += 1
    ...:     else:
    ...:         counts[pair] = 1
    ...:

In [16]: counts
Out[16]: {('a', 1): 2, ('a', 3): 1, ('b', 2): 1, ('c', 1): 1, ('c', 3): 1}

In [17]: sub_p = []

In [18]: sub_q = []

In [19]: for i, element in enumerate(p):
    ...:     if element == 'a':
    ...:         sub_p.append(element)
    ...:         sub_q.append(q[i])
    ...:

In [20]: sub_p
Out[20]: ['a', 'a', 'a']

In [21]: sub_q
Out[21]: [1, 3, 1]
``````

Explanation

1. `zip` takes two lists and runs a figurative zipper between them, resulting in a list of tuples.
2. I've used a simplistic approach: I'm just maintaining a map/dictionary that makes note of how many times it has seen each char-int tuple.
3. Then I make two sub-lists, which you can modify to use the character in question and figure out what it maps to.

### Alternative methods

As abarnert suggested, you could use a `Counter` from `collections` instead. Or you could just use the `count` method on `z`, e.g. `z.count(('a', 1))`. Or you can use a `defaultdict` instead.
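A rough sketch of these alternatives, reusing the `p` and `q` lists from the session above. The names `pair_counts`, `tally`, and `direct` are illustrative and not from the answer.

```python
from collections import Counter, defaultdict

p = ['a', 'b', 'c', 'a', 'c', 'a']
q = [1, 2, 1, 3, 3, 1]

# Counter tallies every (letter, number) pair in a single pass
pair_counts = Counter(zip(p, q))
print(pair_counts[('a', 1)])

# defaultdict(int) starts missing keys at 0, so no if/else is needed
tally = defaultdict(int)
for pair in zip(p, q):
    tally[pair] += 1

# list.count takes the pair as one tuple argument
direct = list(zip(p, q)).count(('a', 1))
print(direct)
```

A `Counter` returns 0 for pairs it has never seen, so looking up a combination that does not occur should not raise a `KeyError`.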