Data Science

A summary of pandas concat, merge, and join

양갱맨 2022. 3. 12. 15:30

Ways to combine DataFrames

pd.concat
- pandas.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=False, copy=True)
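  - objs : sequence of df/series, the DataFrame or Series objects to concatenate
  - axis : int, 0 - along the index (rows), 1 - along the columns
  - join : str, inner or outer; how labels on the other axis are handled
  - ignore_index : bool, discard the existing index and assign a new 0..n-1 index
  - keys : list, used when building a MultiIndex; labels each input at the outermost level instead of relying on the integer index
  - names : list, names for the levels of the resulting MultiIndex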
  - levels : list of sequences, specific levels to use when building the MultiIndex (otherwise inferred from keys)
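  - sort : bool, sort the non-concatenation axis, e.g. column names when axis=0 (default False)
  - verify_integrity : bool, check the new concatenated axis for duplicates (setting it to True can hurt performance)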
```
import pandas as pd

s1 = pd.Series(['a','b'])
s2 = pd.Series(['c','d'])
pd.concat([s1, s2])
================================
0    a
1    b
0    c
1    d
dtype: object
```
✔ The objects to combine must be wrapped in a list (`[]`); passing them as separate arguments raises an error.
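The remaining parameters from the list above can be sketched in the same way. This is a minimal illustration, not an exhaustive one; the variable names (`renumbered`, `labeled`, `side_by_side`) and the small `left`/`right` frames are made up for this example.

```python
import pandas as pd

s1 = pd.Series(['a', 'b'])
s2 = pd.Series(['c', 'd'])

# ignore_index=True discards the original labels and renumbers from 0.
renumbered = pd.concat([s1, s2], ignore_index=True)
print(renumbered.index.tolist())    # [0, 1, 2, 3]

# keys adds an outer index level; names labels the levels of the MultiIndex.
labeled = pd.concat([s1, s2], keys=['s1', 's2'], names=['source', 'row'])
print(labeled.loc['s2'].tolist())   # ['c', 'd']

# axis=1 places the inputs side by side, aligned on the index.
# join='inner' keeps only the index labels present in every input.
left = pd.DataFrame({'A': ['A0', 'A1', 'A2']}, index=[0, 1, 2])
right = pd.DataFrame({'B': ['B1', 'B2', 'B3']}, index=[1, 2, 3])
side_by_side = pd.concat([left, right], axis=1, join='inner')
print(side_by_side.index.tolist())  # [1, 2]

# verify_integrity=True raises ValueError when the result would contain
# duplicate index labels (here 0 and 1 appear in both Series).
try:
    pd.concat([s1, s2], verify_integrity=True)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
print(duplicate_detected)           # True
```

Without `ignore_index` or `keys`, the result keeps the duplicated labels `0, 1, 0, 1`, as in the output shown earlier.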

DataFrame.join

DataFrame.join(other, on=None, how='left', lsuffix='', rsuffix='', sort=False)
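- other : df/series, the DataFrame or Series to join with
- on : str, list of str, column or index level name(s) in the caller to match against the index of other
- how : str, left, right, outer, or inner
- lsuffix : str, suffix appended to overlapping column names from the left DataFrame
- rsuffix : str, suffix appended to overlapping column names from the right DataFrame
- sort : bool, whether to return the result sorted by the join key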

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                   'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})

other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
                      'B': ['B0', 'B1', 'B2']})

df.join(other, lsuffix='_caller', rsuffix='_other')
====================================================================
  key_caller   A key_other    B
0         K0  A0        K0   B0
1         K1  A1        K1   B1
2         K2  A2        K2   B2
3         K3  A3       NaN  NaN
4         K4  A4       NaN  NaN
5         K5  A5       NaN  NaN
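The suffixes in the call above are needed because both frames have a `key` column and the join is done on the index, not on `key`. A small sketch of what happens with and without them, using shortened versions of the same frames (`error_message` and `joined` are names chosen for this example):

```python
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'A': ['A0', 'A1', 'A2']})
other = pd.DataFrame({'key': ['K0', 'K1'], 'B': ['B0', 'B1']})

# Both frames contain a 'key' column, so a plain join cannot name
# the result columns and raises ValueError.
try:
    df.join(other)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# With suffixes, only the overlapping column is renamed.
joined = df.join(other, lsuffix='_caller', rsuffix='_other')
print(joined.columns.tolist())  # ['key_caller', 'A', 'key_other', 'B']
```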

df.set_index('key').join(other.set_index('key'))
=====================================================
      A    B
key
K0   A0   B0
K1   A1   B1
K2   A2   B2
K3   A3  NaN
K4   A4  NaN
K5   A5  NaN

df.join(other.set_index('key'), on='key')
==============================================
  key   A    B
0  K0  A0   B0
1  K1  A1   B1
2  K2  A2   B2
3  K3  A3  NaN
4  K4  A4  NaN
5  K5  A5  NaN
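All of the examples above use the default `how='left'`. As a rough sketch of how the other options change which keys survive, the snippet below joins two small frames (invented for this example, with `K9` present only on the right) and collects the resulting index for each option. The index is sorted before printing so the output does not depend on row order.

```python
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                   'A': ['A0', 'A1', 'A2', 'A3']}).set_index('key')
other = pd.DataFrame({'key': ['K0', 'K1', 'K9'],
                      'B': ['B0', 'B1', 'B9']}).set_index('key')

# Collect the keys kept by each join type.
index_by_how = {
    how: sorted(df.join(other, how=how).index)
    for how in ['left', 'right', 'inner', 'outer']
}

for how, keys in index_by_how.items():
    print(how, keys)
# left  ['K0', 'K1', 'K2', 'K3']
# right ['K0', 'K1', 'K9']
# inner ['K0', 'K1']
# outer ['K0', 'K1', 'K2', 'K3', 'K9']
```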

pd.merge
- DataFrame.merge(right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
- pandas.merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
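  - right : df/series, the DataFrame or Series to merge with
  - how : left, right, outer, inner, or cross
  - on : label, list, column or index name(s) to join on; must exist in both DataFrames
  - left_on : label, list, join key column or index in the left DataFrame
  - right_on : label, list, join key column or index in the right DataFrame
  - left_index : bool, use the left DataFrame's index as the join key
  - right_index : bool, use the right DataFrame's index as the join key
  - suffixes : tuple, suffixes appended to overlapping column names from the left and right DataFrames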
```
df1 = pd.DataFrame({'lkey': ['foo', 'bar', 'baz', 'foo'],
                    'value': [1, 2, 3, 5]})
df2 = pd.DataFrame({'rkey': ['foo', 'bar', 'baz', 'foo'],
                    'value': [5, 6, 7, 8]})

df1.merge(df2, left_on='lkey', right_on='rkey',
          suffixes=('_left', '_right'))
=================================================================
  lkey  value_left rkey  value_right
0  foo           1  foo            5
1  foo           1  foo            8
2  foo           5  foo            5
3  foo           5  foo            8
4  bar           2  bar            6
5  baz           3  baz            7
```

