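Introduction
The long holiday has started, so I felt like doing some non-essential, non-urgent analysis and looked into which patent document is the most cited in the world.
SQL on BigQuery
Through an online course, I learned for the first time that SQL is pronounced "sequel".
Two parts were hard. The first was building both citation counts: the total count and the deduplicated count, which ignores the same document being cited repeatedly against one application (for example in several rejection notices). The second was generating the cited-by information in the reverse direction, because BigQuery only holds the citing-side information.
Added 2020-12-04: Google Patents Research Data, in the same public dataset, does contain cited-by information!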
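As a rough illustration of the two difficulties described above, here is a minimal Python sketch on made-up data. The record layout and all identifiers are assumptions for illustration only, not the BigQuery schema; the deduplication key follows the idea of counting each citing patent family once.

```python
from collections import defaultdict

# Toy citing-side records: (citing publication, citing family_id, cited publications).
# All identifiers are made up for illustration.
citing_records = [
    ("US-1-A", "fam1", ["US-9-A", "JP-8-A"]),
    ("EP-1-A1", "fam1", ["US-9-A"]),  # same family as US-1-A
    ("WO-2-A1", "fam2", ["US-9-A"]),
]


def count_cited(records):
    """Invert citing -> cited edges, then count total and per-family unique citations."""
    total = defaultdict(int)
    families = defaultdict(set)
    for _citing_pub, family_id, cited_list in records:
        for cited in cited_list:
            total[cited] += 1
            families[cited].add(family_id)
    return {pub: (total[pub], len(families[pub])) for pub in total}


counts = count_cited(citing_records)
print(counts["US-9-A"])  # (3, 2): three citations, from two distinct families
```

The first number plays the role of a total citation count and the second the role of a deduplicated count.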
The SQL ended up a little complicated, but it looks like the following.
SQL for building the dataset
WITH bibtable as (
SELECT
pub.application_number AS appnum,
pub.publication_number AS pubnum,
pub.filing_date as appday,
STRING_AGG(DISTINCT(applicants.name)) AS applicants ,
SUBSTR(STRING_AGG(ipcs.code),0,1) AS ipc4,
STRING_AGG(DISTINCT(title.text)) AS title
FROM `patents-public-data.patents.publications_201912` AS pub,
UNNEST(title_localized) AS title,
UNNEST(assignee_harmonized) as applicants,
UNNEST(ipc) as ipcs
GROUP BY appnum,pubnum,appday
)
SELECT
pubnum,
SUBSTR(pubnum,0,2) AS appcountry,
SUBSTR(STRING_AGG(DISTINCT(CAST(bibtable.appday AS STRING))),0,4) AS appyear,
COUNT(main.application_number) AS total_cit_count,
COUNT(DISTINCT(main.family_id)) AS unique_cit_count,
STRING_AGG(DISTINCT(ipc4)) AS ipcs,
STRING_AGG(DISTINCT(title)) AS titles,
STRING_AGG(DISTINCT(applicants)) AS applicants,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'A' THEN 1 ELSE 0 END) as IPC_A,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'B' THEN 1 ELSE 0 END) as IPC_B,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'C' THEN 1 ELSE 0 END) as IPC_C,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'D' THEN 1 ELSE 0 END) as IPC_D,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'E' THEN 1 ELSE 0 END) as IPC_E,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'F' THEN 1 ELSE 0 END) as IPC_F,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'G' THEN 1 ELSE 0 END) as IPC_G,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'H' THEN 1 ELSE 0 END) as IPC_H,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'US' THEN 1 ELSE 0 END) as US,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'JP' THEN 1 ELSE 0 END) as JP,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'EP' THEN 1 ELSE 0 END) as EP,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'KR' THEN 1 ELSE 0 END) as KR,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'CN' THEN 1 ELSE 0 END) as CN,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'WO' THEN 1 ELSE 0 END) as WO
FROM
`patents-public-data.patents.publications_201912` as main,
UNNEST(main.citation) AS cit,
UNNEST(main.ipc) AS ipcs
LEFT JOIN bibtable
ON cit.publication_number = bibtable.pubnum
WHERE SUBSTR(main.publication_number,0,2) IN ('US','JP','EP','CN','KR','WO') AND pubnum IS NOT NULL AND ipcs.first = TRUE
GROUP BY appnum,pubnum
ORDER BY total_cit_count DESC,pubnum DESC
Alternative version of the dataset SQL: using ARRAY_AGG and STRUCT may be cleaner
WITH bibtable as (
SELECT
pub.application_number AS appnum,
pub.publication_number AS pubnum,
pub.filing_date as appday,
STRING_AGG(DISTINCT(applicants.name)) AS applicants ,
STRING_AGG(DISTINCT(title.text)) AS texts,
SUBSTR(STRING_AGG(ipcs.code),0,1) AS ipc4,
STRING_AGG(DISTINCT(title.text)) AS title
FROM `patents-public-data.patents.publications_201912` AS pub,
UNNEST(title_localized) AS title,
UNNEST(assignee_harmonized) as applicants,
UNNEST(ipc) as ipcs
GROUP BY appnum,pubnum,appday
)
SELECT
bibtable.pubnum,
SUBSTR(bibtable.pubnum,0,2) AS appcountry,
SUBSTR(STRING_AGG(DISTINCT(CAST(bibtable.appday AS STRING))),0,4) AS appyear,
#STRING_AGG(DISTINCT(cit.application_number)) as appnum,
#STRING_AGG(DISTINCT(cit.type)) AS cit_type,
COUNT(main.application_number) AS total_cit_count,
COUNT(DISTINCT(main.family_id)) AS unique_cit_count,
STRING_AGG(DISTINCT(ipc4)) AS ipcs,
STRING_AGG(DISTINCT(title)) AS titles,
STRING_AGG(DISTINCT(applicants)) AS applicants,
ARRAY_AGG(STRUCT(SUBSTR(ipcs.code,0,1) AS ipc_sec,
SUBSTR(main.publication_number,0,2) AS cc,
SUBSTR(CAST(main.filing_date AS STRING),0,4) AS year,
appls)
) AS fcit,
#STRING_AGG(cit.type),STRING_AGG(DISTINCT(family_id))
FROM
`patents-public-data.patents.publications_201912` as main,
UNNEST(main.citation) AS cit,
UNNEST(main.ipc) AS ipcs,
UNNEST(main.assignee) AS appls
LEFT JOIN bibtable
ON cit.publication_number = bibtable.pubnum
WHERE SUBSTR(main.publication_number,0,2) IN ('US','JP','EP','CN','KR','WO') AND pubnum IS NOT NULL AND ipcs.first = TRUE
GROUP BY appnum,pubnum
ORDER BY total_cit_count DESC,pubnum DESC
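With the ARRAY_AGG(STRUCT(...)) version, the per-section and per-country breakdowns are no longer separate columns and have to be derived from the nested `fcit` array afterwards. The sketch below shows one way to do that in plain Python. It assumes each result row arrives as a dict whose `fcit` value is a list of dicts with the keys `ipc_sec`, `cc` and `year`; the `appls` field is left out, and the sample row is hypothetical.

```python
from collections import Counter

# One hypothetical result row; "fcit" mirrors the ARRAY_AGG(STRUCT(...)) column.
row = {
    "pubnum": "US-0000000-A",
    "fcit": [
        {"ipc_sec": "C", "cc": "US", "year": "2001"},
        {"ipc_sec": "C", "cc": "EP", "year": "2003"},
        {"ipc_sec": "G", "cc": "US", "year": "2003"},
    ],
}


def summarize_fcit(fcit):
    """Count citing documents per IPC section, per country code, and per filing year."""
    return {
        "by_section": Counter(c["ipc_sec"] for c in fcit),
        "by_country": Counter(c["cc"] for c in fcit),
        "by_year": Counter(c["year"] for c in fcit),
    }


summary = summarize_fcit(row["fcit"])
print(summary["by_section"]["C"], summary["by_country"]["US"])  # 2 2
```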
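0. Output (top 10)
The query took about 5 minutes to run and extracted 30,262,060 documents in total. Below are the top 10 by citation count.
Note: the trailing "IPC_~" columns show how many times the document was cited from each IPC section, and the country columns show how many times it was cited from applications in each country.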
Row | pubnum | appcountry | appyear | total_cit_count | unique_cit_count | ipcs | titles | applicants | IPC_A | IPC_B | IPC_C | IPC_D | IPC_E | IPC_F | IPC_G | IPC_H | US | JP | EP | KR | CN | WO |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | US-4683202-A | US | 1985 | 8996 | 4766 | C | Process for amplifying nucleic acid sequences | CETUS CORP | 1602 | 148 | 6116 | 13 | 1 | 12 | 1083 | 21 | 5542 | 0 | 1701 | 1 | 1 | 1751 |
2 | US-4816567-A | US | 1983 | 8668 | 4748 | C | Recombinant immunoglobin preparations | GENENTECH INC | 3319 | 27 | 4354 | 8 | 0 | 3 | 954 | 3 | 4387 | 0 | 1878 | 0 | 1 | 2402 |
3 | US-4683195-A | US | 1986 | 7174 | 3763 | C | Process for amplifying, detecting, and/or-cloning nucleic acid sequences | CETUS CORP | 1115 | 154 | 4853 | 1 | 1 | 11 | 1021 | 18 | 4593 | 0 | 1355 | 1 | 1 | 1224 |
4 | US-5223409-A | US | 1991 | 4956 | 2523 | C | Directed evolution of novel binding proteins | PROTEIN ENG CORP | 1598 | 17 | 2833 | 10 | 0 | 2 | 493 | 3 | 2629 | 2 | 1092 | 1 | 0 | 1232 |
5 | US-5523520-A | US | 1994 | 4759 | 3761 | C | Mutant dwarfism gene of petunia | GOLDSMITH SEEDS INC | 4585 | 1 | 162 | 0 | 2 | 1 | 5 | 3 | 4759 | 0 | 0 | 0 | 0 | 0 |
6 | US-4946778-A | US | 1989 | 4657 | 2474 | C | Single polypeptide chain binding molecules | GENEX CORP | 1655 | 9 | 2436 | 1 | 0 | 1 | 554 | 1 | 2521 | 1 | 1146 | 1 | 0 | 988 |
7 | US-5585089-A | US | 1995 | 4604 | 2520 | A | Humanized immunoglobulins | PROTEIN DESIGN LABS INC | 1829 | 38 | 2306 | 0 | 0 | 1 | 430 | 0 | 2337 | 1 | 976 | 0 | 2 | 1288 |
8 | US-5530101-A | US | 1990 | 4036 | 1987 | A | Humanized immunoglobulins | PROTEIN DESIGN LABS INC | 1776 | 9 | 1920 | 0 | 0 | 1 | 330 | 0 | 2564 | 2 | 672 | 0 | 4 | 794 |
9 | US-5892900-A | US | 1996 | 4035 | 1944 | G | Systems and methods for secure transaction management and electronic rights protection | INTERTRUST TECH CORP | 53 | 26 | 3 | 0 | 1 | 1 | 2886 | 1065 | 3955 | 2 | 19 | 4 | 29 | 26 |
10 | US-2003189401-A1 | US | 2003 | 3849 | 1958 | C | Organic electroluminescent device | INT MFG & ENG SERVICES CO LTD | 2 | 14 | 33 | 1 | 1 | 6 | 759 | 3033 | 3820 | 1 | 16 | 3 | 1 | 8 |
The world number one is US4683202A
"Process for amplifying nucleic acid sequences". Presumably a biotech application. Second place looks similar.
I saved the result as a BigQuery table.
Then I called BigQuery from Python and visualized the result with pandas.
Results
1. Overall
Distribution of citation counts
- A histogram to start with.
Code
from google.cloud import bigquery
import plotly.express as px

project_id = '~~~~~'
client = bigquery.Client(project=project_id)

# Query
query = """
SELECT total_cit_count, count(*) AS fcit_count
FROM `bqml_tutorial.fcitdata`
GROUP BY total_cit_count
ORDER BY total_cit_count DESC
"""

# Load the result into pandas.
df_count_fcit = client.query(query).to_dataframe()

# Plot
px.histogram(df_count_fcit, x='total_cit_count')
Seen this way, the distribution does not fall off smoothly as the citation count grows; there is a noticeable bump around 3,000 citations. The histogram of deduplicated citation counts shows the same tendency (a bump around 1,800).
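The query above returns one row per distinct citation count, with the number of documents in `fcit_count`. One way to check whether a bump is real is to weight each bin by that document count. The following is a minimal sketch using only the standard library; the bin width and the sample pairs are made up for illustration.

```python
from collections import Counter


def binned_counts(pairs, bin_width):
    """Sum document counts into fixed-width bins keyed by each bin's lower edge."""
    bins = Counter()
    for value, n_docs in pairs:
        bins[(value // bin_width) * bin_width] += n_docs
    return dict(sorted(bins.items()))


# Made-up (total_cit_count, fcit_count) pairs, shaped like the query result.
pairs = [(1, 500), (2, 300), (950, 4), (3010, 2), (3120, 3), (8996, 1)]
hist = binned_counts(pairs, bin_width=1000)
print(hist)  # {0: 804, 3000: 5, 8000: 1}
```

In plotly, passing `y='fcit_count'` to `px.histogram` should give the same kind of weighting, since the default aggregation becomes a sum once `y` is supplied.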
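Distribution by filing year of the cited documents
- Which years' documents are cited most? The numbers keep rising, but documents from around 2012 to 2013 are currently cited the most.
Code
# Fetching the data into the DataFrame is omitted; it only differs in the column names.
fig = px.bar(df_count_fcit[df_count_fcit['appyear'].astype(int) > 1990], x='appyear', y='count')
fig.show()
Which countries' applications are cited most?
- The underlying data includes examiner citations plus applicant citations (IDS) for the US, whereas the other countries probably do not include applicant citations, so a simple comparison is not possible. That said, citations of Chinese documents are closing in on JP, and the recent growth looks remarkable.
Code
SELECT appcountry,SUM(unique_cit_count) AS count FROM `bqml_tutorial.fcitdata` GROUP BY appcountry ORDER BY count DESC
fig = px.bar(df_count_fcit[0:10].sort_values(by='count'), y='appcountry', x='count', orientation='h', text='count')
fig.show()
Most-cited applicants
- Note that applicant names have not been normalized. Electronics companies rank high, roughly in proportion to their filing volume.
Code
SELECT applicants,COUNT(*) AS count FROM `bqml_tutorial.fcitdata` GROUP BY applicants ORDER BY count DESC
fig = px.bar(df_count_fcit[0:10].sort_values(by='count'), y='applicants', x='count', orientation='h', text='count')
fig.show()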
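Since the applicant names are not normalized, spelling variants of the same company are counted separately. A very rough normalization could look like the sketch below. The suffix list and the rules are hypothetical and far from complete, and the sample names are made up; real name harmonization needs much more care.

```python
from collections import Counter

# Hypothetical, deliberately small list of legal-form suffixes to strip.
SUFFIXES = ("CO LTD", "INC", "CORP", "LTD", "KK", "LLC", "AG")


def normalize_applicant(name):
    """Uppercase, drop periods, collapse whitespace, strip trailing legal-form suffixes."""
    cleaned = name.upper().replace(".", "").replace(",", " ")
    cleaned = " ".join(cleaned.split())
    stripped = True
    while stripped:
        stripped = False
        for suffix in SUFFIXES:
            if cleaned.endswith(" " + suffix):
                cleaned = cleaned[: -(len(suffix) + 1)]
                stripped = True
                break
    return cleaned


names = ["Canon KK", "CANON K.K.", "Canon Inc.", "TDK CORP"]
name_counts = Counter(normalize_applicant(n) for n in names)
print(name_counts)  # Counter({'CANON': 3, 'TDK': 1})
```

The `applicants` column in the saved table is a comma-joined string, so it would need to be split into individual names before applying a function like this.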
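2. By technical field
- Patents worldwide are assigned a technical classification called the IPC (International Patent Classification). For each section, the broadest level of the classification (taking only the first-listed IPC), I extracted the most-cited applications. Some odd classifications other than A-H have crept in, but the applications in each field are fun to look at.
Code
SELECT *
FROM (
  SELECT
    ipcs,
    row_number() over (partition by ipcs order by unique_cit_count DESC) AS rank,
    pubnum,
    total_cit_count, unique_cit_count,
    applicants,
    titles
  FROM `bqml_tutorial.fcitdata`
  GROUP BY ipcs, pubnum, total_cit_count, unique_cit_count, titles, applicants
)
WHERE rank <= 5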
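The ranking logic of the query above (top N per section by `unique_cit_count`) can be mimicked in a few lines of Python. This sketch also drops anything outside sections A to H, which the SQL does not do; that filter is an addition for illustration. The sample rows are a handful of entries taken from the result list in this section.

```python
from collections import defaultdict

VALID_SECTIONS = set("ABCDEFGH")  # IPC sections run from A to H


def top_n_per_section(rows, n):
    """Keep the n rows with the highest unique_cit_count in each valid IPC section."""
    grouped = defaultdict(list)
    for row in rows:
        section = row["ipcs"][:1]
        if section in VALID_SECTIONS:  # skips odd values such as "e" or "J"
            grouped[section].append(row)
    return {
        section: sorted(members, key=lambda r: r["unique_cit_count"], reverse=True)[:n]
        for section, members in sorted(grouped.items())
    }


rows = [
    {"ipcs": "A", "pubnum": "US-5530101-A", "unique_cit_count": 1987},
    {"ipcs": "A", "pubnum": "US-5585089-A", "unique_cit_count": 2520},
    {"ipcs": "A", "pubnum": "US-4658085-A", "unique_cit_count": 1829},
    {"ipcs": "e", "pubnum": "US-7174579-B1", "unique_cit_count": 13},
    {"ipcs": "J", "pubnum": "US-7693341-B2", "unique_cit_count": 25},
    {"ipcs": "E", "pubnum": "US-4902508-A", "unique_cit_count": 493},
]
top = top_n_per_section(rows, n=2)
print([r["pubnum"] for r in top["A"]])  # ['US-5585089-A', 'US-5530101-A']
```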
Almost all of them are US documents.
List
Section | rank | pubnum | total_cit_count | unique_cit_count | applicants | titles |
---|---|---|---|---|---|---|
A | 1 | US-5585089-A | 4604 | 2520 | PROTEIN DESIGN LABS INC | Humanized immunoglobulins |
A | 2 | US-5530101-A | 4036 | 1987 | PROTEIN DESIGN LABS INC | Humanized immunoglobulins |
A | 3 | US-4658085-A | 2222 | 1829 | UNIV GUELPH | Hybridization using cytoplasmic male sterility, cytoplasmic herbicide tolerance, and herbicide tolerance from nuclear genes |
A | 4 | US-5693762-A | 3455 | 1822 | PROTEIN DESIGN LABS INC | Humanized immunoglobulins |
A | 5 | US-5569825-A | 3082 | 1730 | GENPHARM INT | Transgenic non-human animals capable of producing heterologous antibodies of various isotypes |
B | 1 | US-2006113549-A1 | 3391 | 1756 | TOKYO INST TECH | Light-emitting device |
B | 2 | US-2010092800-A1 | 3150 | 1678 | CANON KK | Substrate for growing wurtzite type crystal and method for manufacturing the same and semiconductor device |
B | 3 | US-6336137-B1 | 1505 | 940 | SIEBEL SYSTEMS INC | Web client-server system and method for incompatible page markup and presentation languages |
B | 4 | US-4723129-A | 1489 | 928 | CANON KK | Bubble jet recording method and apparatus in which a heating element generates bubbles in a liquid flow path to project droplets |
B | 5 | US-6766817-B2 | 1856 | 908 | TUBARC TECHNOLOGIES LLC | Fluid conduction utilizing a reversible unsaturated siphon with tubarc porosity action |
C | 1 | US-4683202-A | 8996 | 4766 | CETUS CORP | Process for amplifying nucleic acid sequences |
C | 2 | US-4816567-A | 8668 | 4748 | GENENTECH INC | Recombinant immunoglobin preparations |
C | 3 | US-4683195-A | 7174 | 3763 | CETUS CORP | Process for amplifying, detecting, and/or-cloning nucleic acid sequences |
C | 4 | US-5523520-A | 4759 | 3761 | GOLDSMITH SEEDS INC | Mutant dwarfism gene of petunia |
C | 5 | US-5223409-A | 4956 | 2523 | PROTEIN ENG CORP | Directed evolution of novel binding proteins |
D | 1 | US-3849241-A | 1577 | 926 | EXXON RESEARCH ENGINEERING CO | Non-woven mats by melt blowing |
D | 2 | US-4340563-A | 1491 | 906 | KIMBERLY CLARK CO | Method for forming nonwoven webs |
D | 3 | US-3802817-A | 1282 | 805 | ASAHI CHEMICAL IND | Apparatus for producing non-woven fleeces |
D | 4 | US-3692618-A | 1236 | 802 | METALLGESELLSCHAFT AG | Continuous filament nonwoven web |
D | 5 | US-4100324-A | 1344 | 775 | KIMBERLY CLARK CO | Nonwoven fabric and method of producing same |
e | 1 | US-7174579-B1 | 23 | 13 | BAUZA PEDRO | Temperature display system |
E | 1 | US-4902508-A | 1159 | 493 | PURDUE RESEARCH FOUNDATION | Tissue graft composition |
E | 2 | CN-201962688-U | 392 | 391 | FIRST ENGINEERING COMPANY OF CCCC FOURTH HARBOR ENGINEERING CO LTD | Diaphram wall wharf pile foundation structure,地下连续墙码头桩基础结构 |
E | 3 | US-5809415-A | 720 | 352 | OPENWAVE SYS INC | Method and architecture for an interactive two-way data communication network |
E | 4 | US-5445304-A | 709 | 312 | UNITED STATES SURGICAL CORP | Safety device for a surgical stapler cartridge |
E | 5 | US-7098794-B2 | 618 | 279 | KIMBERLY CLARK CO | Deactivating a data tag for user privacy or tamper-evident packaging |
F | 1 | US-7213940-B1 | 917 | 506 | LED LIGHTING FIXTURES INC | Lighting device and lighting method |
F | 2 | US-5037397-A | 895 | 384 | MEDICAL DISTRIBUTORS INC | Universal clamp |
F | 3 | US-2010327766-A1 | 611 | 378 | LEVINE DAVID B,RECKER MICHAEL V | Wireless emergency lighting system |
F | 4 | US-6577073-B2 | 569 | 367 | MATSUSHITA ELECTRIC IND CO LTD | Led lamp |
F | 5 | US-6432098-B1 | 661 | 361 | PROCTER & GAMBLE | Absorbent article fastening device |
G | 1 | US-5892900-A | 4035 | 1944 | INTERTRUST TECH CORP | Systems and methods for secure transaction management and electronic rights protection |
G | 2 | US-2007194379-A1 | 3592 | 1833 | JAPAN SCIENCE & TECH AGENCY | Amorphous Oxide And Thin Film Transistor |
G | 3 | US-5731856-A | 3571 | 1795 | SAMSUNG ELECTRONICS CO LTD | Methods for forming liquid crystal displays including thin film transistors and gate pads having a particular structure |
G | 4 | US-2006208977-A1 | 3373 | 1746 | SEMICONDUCTOR ENERGY LAB | Semiconductor device, and display device, driving method and electronic apparatus thereof |
G | 5 | US-2009280600-A1 | 3163 | 1727 | JAPAN SCIENCE & TECH AGENCY | Amorphous oxide and thin film transistor |
H | 1 | US-2006244107-A1 | 3493 | 1844 | SUGIHARA TOSHINORI,KAWASAKI MASASHI,OHNO HIDEO | Semiconductor device, manufacturing method, and electronic device |
H | 2 | US-7674650-B2 | 3529 | 1837 | SEMICONDUCTOR ENERGY LAB | Semiconductor device and manufacturing method thereof |
H | 3 | US-2006108636-A1 | 3475 | 1777 | TOKYO INST TECH | Amorphous oxide and field effect transistor |
H | 4 | US-7061014-B2 | 3500 | 1776 | JAPAN SCIENCE & TECH AGENCY | Natural-superlattice homologous single crystal thin film, method for preparation thereof, and device using said single crystal thin film |
H | 5 | US-6294274-B1 | 3391 | 1771 | TDK CORP,KAWAZOE HIROSHI | |
J | 1 | US-7693341-B2 | 36 | 25 | APPLE INC | Workflows for color correcting images |
J | 2 | US-7847532-B2 | 20 | 13 | ASTEC INT LTD | Centralized controller and power manager for on-board power systems |
J | 3 | US-7364473-B2 | 11 | 7 | FUJITSU LTD | Connector for electronic device |
J | 4 | US-7239379-B2 | 1 | 1 | TECHNOLOGY INNOVATIONS LLC | Method and apparatus for determining a vertical intensity profile through a plane of focus in a confocal microscope |
K | 1 | US-7208984-B1 | 18 | 9 | LINEAR TECHN INC | CMOS driver with minimum shoot-through current |
K | 2 | US-7287321-B2 | 13 | 7 | DENSO CORP | Multi-layer board manufacturing method |
K | 3 | US-7233174-B2 | 10 | 4 | TEXAS INSTRUMENTS INC | Dual polarity, high input voltage swing comparator using MOS input transistors |
K | 4 | US-7752749-B2 | 2 | 1 | PANASONIC CORP | Electronic component mounting method and electronic component mounting device |
M | 1 | US-8233880-B2 | 21 | 16 | JOHNSON GARTH,GLOBAL TEL LINK CORP | Integration of cellular phone detection and reporting into a prison telephone system |
N | 1 | US-8141117-B1 | 16 | 10 | LOBECK MATTHEW R,ARRIS GROUP INC,CHRISTENSEN KORY D,CONINGSBY DONNA JO,BROADUS CHARLES R,KELLUM JOHN M | PC media center and extension device for interfacing with a personal video recorder through a home network |
N | 2 | US-7258343-B2 | 15 | 9 | BANDAI AMERICA INC | Card game and methods of play |
N | 3 | US-8134758-B2 | 4 | 2 | UEDA HIDENORI,NAKAISHI YOSHIAKI,OKI DATA KK | Image reading apparatus, image forming apparatus, image forming system that employs the image reading apparatus and the image forming apparatus |
N | 4 | US-8198132-B2 | 2 | 1 | GALERA MANOLITO,ALABIN LEOCADIO MORONA,FAIRCHILD SEMICONDUCTOR | Isolated stacked die semiconductor packages |
O | 1 | US-8306908-B1 | 36 | 20 | WEST CORP,PETTAY MARK J,JOHNSON ROBERT E,KEMPKES RODNEY J,BARKER THOMAS B,STRUBBE TODD B | Methods and apparatus for intelligent selection of goods and services in telephonic and electronic commerce |
Q | 1 | US-7689487-B1 | 57 | 36 | AMAZON COM INC | Computer-assisted funds transfer system |
Q | 2 | US-8170916-B1 | 41 | 18 | AHMED WAQAS,MANOLACHE FLORIN V,AMAZON TECH INC,MONGRAIN SCOTT ALLEN,MUNTEANU VALENTIN RADU,RUDEANU CORNELIU GABRIEL ALEXANDRU,WILSON AARON D,DICKER RUSSELL A,ROSCA VAL DAN DAR ION I | Related-item tag suggestions |
Q | 3 | US-7130826-B1 | 5 | 3 | IBM | Method and apparatus for conducting coinless transactions |
Q | 4 | US-8190451-B2 | 2 | 2 | DANCHA LYNNE A,TAN AGNES W H,LINDQUIST TAMMIE J,LLOYD KAREN D,GROUP HEALTH PLAN INC,KOOPMEINERS MICHAEL | Method and computer program product for predicting and minimizing future behavioral health-related hospital admissions |
R | 1 | US-7360912-B1 | 42 | 30 | PASS & SEYMOUR INC | Electrical device with lamp module |
R | 2 | US-7196508-B2 | 24 | 15 | MIRAE CORP | Handler for testing semiconductor devices |
R | 3 | US-8052455-B1 | 10 | 7 | HONGFUJIN PREC IND SHENZHEN,HON HAI PREC IND CO LTD | Mounting apparatus for flash drive |
R | 4 | US-8128432-B2 | 10 | 5 | REVOL DIDIER,LAROCHE VINCENT,FORATIER NATHALIE,JAOUEN JEAN-MARC,LEGRAND SNC,COUSY JEAN-PIERRE,LEGRAND FRANCE | Insert and method of assembling such an insert |
R | 5 | US-8002580-B2 | 5 | 2 | ANDREW LLC | Coaxial cable crimp connector |
V | 1 | US-7692056-B2 | 1 | 1 | SHANGHAI RES INST PETROCHEMICAL TECHNOLOGY SINOPEC | Process for producing lower olefins from methanol or dimethylether |
Y | 1 | US-8471523-B2 | 1 | 1 | MULTI FUNCTION CO LTD,LIN WEI-JONG | Charging/discharging device having concealed universal serial bus plug(s) |
3. By country
I extracted the top 5 for each country. It is interesting how much the countries differ. Japan is all semiconductors.
Code
SELECT *
FROM `bqml_tutorial.fcitdata`
WHERE appcountry = 'WO' # OR appcountry = 'JP' OR appcountry = 'EP' OR appcountry = 'CN' OR appcountry = 'KR' OR appcountry = 'WO'
ORDER BY unique_cit_count DESC
LIMIT 5
List
Row | pubnum | appcountry | appyear | total_cit_count | unique_cit_count | ipcs | titles | applicants | IPC_A | IPC_B | IPC_C | IPC_D | IPC_E | IPC_F | IPC_G | IPC_H | US | JP | EP | KR | CN | WO |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | JP-2002076356-A | JP | 2000 | 3084 | 1758 | H | 半導体デバイス | JAPAN SCIENCE & TECH CORP | 1 | 4 | 11 | 2 | 0 | 2 | 691 | 2373 | 3017 | 43 | 6 | 4 | 5 | 9 |
2 | JP-2004103957-A | JP | 2002 | 3110 | 1746 | H | ホモロガス薄膜を活性層として用いる透明薄膜電界効果型トランジスタ,Transparent thin film field effect type transistor using homologous thin film as active layer | JAPAN SCIENCE & TECH CORP,OTA HIROMICHI | 1 | 4 | 18 | 1 | 0 | 2 | 688 | 2396 | 2912 | 93 | 8 | 90 | 0 | 7 |
3 | JP-2003086808-A | JP | 2001 | 3066 | 1742 | H | Thin film transistor and matrix display,薄膜トランジスタおよびマトリクス表示装置 | SHARP KK,KAWASAKI MASASHI,ONO HIDEO | 1 | 3 | 14 | 1 | 0 | 2 | 683 | 2362 | 2950 | 96 | 0 | 7 | 1 | 12 |
4 | JP-2002289859-A | JP | 2001 | 3060 | 1734 | H | 薄膜トランジスタ,Thin-film transistor | MINOLTA CO LTD | 1 | 3 | 11 | 1 | 0 | 2 | 685 | 2357 | 2979 | 57 | 2 | 10 | 6 | 6 |
5 | JP-2000044236-A | JP | 1998 | 3028 | 1722 | H | 透明導電性酸化物薄膜を有する物品及びその製造方法,Article having transparent conductive oxide thin film and its production | HOYA CORP | 1 | 7 | 20 | 1 | 0 | 2 | 680 | 2317 | 2991 | 13 | 12 | 3 | 3 | 6 |
1 | US-4683202-A | US | 1985 | 8996 | 4766 | C | Process for amplifying nucleic acid sequences | CETUS CORP | 1602 | 148 | 6116 | 13 | 1 | 12 | 1083 | 21 | 5542 | 0 | 1701 | 1 | 1 | 1751 |
2 | US-4816567-A | US | 1983 | 8668 | 4748 | C | Recombinant immunoglobin preparations | GENENTECH INC | 3319 | 27 | 4354 | 8 | 0 | 3 | 954 | 3 | 4387 | 0 | 1878 | 0 | 1 | 2402 |
3 | US-4683195-A | US | 1986 | 7174 | 3763 | C | Process for amplifying, detecting, and/or-cloning nucleic acid sequences | CETUS CORP | 1115 | 154 | 4853 | 1 | 1 | 11 | 1021 | 18 | 4593 | 0 | 1355 | 1 | 1 | 1224 |
4 | US-5523520-A | US | 1994 | 4759 | 3761 | C | Mutant dwarfism gene of petunia | GOLDSMITH SEEDS INC | 4585 | 1 | 162 | 0 | 2 | 1 | 5 | 3 | 4759 | 0 | 0 | 0 | 0 | 0 |
5 | US-5223409-A | US | 1991 | 4956 | 2523 | C | Directed evolution of novel binding proteins | PROTEIN ENG CORP | 1598 | 17 | 2833 | 10 | 0 | 2 | 493 | 3 | 2629 | 2 | 1092 | 1 | 0 | 1232 |
1 | EP-2226847-A2 | EP | 2005 | 2388 | 1510 | H | Amorphous oxide and thin film transistor,Oxyde amorphe et transistor à couche mince,Amorpher Oxid- und Dünnschichttransistor | JAPAN SCIENCE & TECH AGENCY | 1 | 3 | 9 | 0 | 0 | 2 | 594 | 1779 | 2387 | 0 | 0 | 1 | 0 | 0 |
2 | EP-0239400-A2 | EP | 1987 | 1789 | 1190 | C | Recombinant antibodies and methods for their production,Anticorps recombinants et leurs procédés de production,Rekombinante Antikörper und Verfahren zu deren Herstellung | WINTER GREGORY PAUL | 593 | 1 | 1066 | 0 | 0 | 0 | 129 | 0 | 572 | 0 | 490 | 1 | 2 | 724 |
3 | EP-1737044-A1 | EP | 2005 | 1662 | 1187 | H | Oxyde amorphe et transistor film mince,Amorph-oxid- und dünnfilmtransistor,Amorphous oxide and thin film transistor | JAPAN SCIENCE & TECH AGENCY | 0 | 3 | 7 | 0 | 0 | 2 | 452 | 1198 | 1651 | 0 | 4 | 0 | 1 | 6 |
4 | EP-0404097-A2 | EP | 1990 | 1568 | 1152 | A | Récepteurs mono- et oligovalents, bispécifiques et oligospécifiques, ainsi que leur production et application,Bispezifische und oligospezifische, mono- und oligovalente Rezeptoren, ihre Herstellung und Verwendung,Bispecific and oligospecific, mono- and oligovalent receptors, production and applications thereof | BEHRINGWERKE AG | 505 | 6 | 910 | 0 | 0 | 0 | 147 | 0 | 262 | 0 | 495 | 0 | 0 | 811 |
5 | EP-1737044-B1 | EP | 2005 | 1339 | 934 | H | Amorphes oxid und dünnfilmtransistor,Amorphous oxide and thin film transistor,Oxyde amorphe et transistor film mince | JAPAN SCIENCE & TECH AGENCY | 1 | 0 | 4 | 1 | 0 | 0 | 236 | 1097 | 1339 | 0 | 0 | 0 | 0 | 0 |
1 | CN-201962688-U | CN | 2010 | 392 | 391 | E | Diaphram wall wharf pile foundation structure,地下连续墙码头桩基础结构 | FIRST ENGINEERING COMPANY OF CCCC FOURTH HARBOR ENGINEERING CO LTD | 0 | 0 | 0 | 0 | 392 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 392 | 0 |
2 | CN-1895777-A | CN | 2005 | 297 | 296 | B | 一种组装碳化物的介孔分子筛催化剂及其制备方法,Porous molecular-sieve catalyst for assembling carbide and its preparation | UNIV BEIJING CHEMICAL | 0 | 7 | 290 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 297 | 0 |
3 | CN-1634601-A | CN | 2003 | 680 | 288 | A | 一种用于医疗器械灭菌的方法,Method for sterilizing medical appliance | JILIN PROVINCE ZHONGLI IND CO | 631 | 8 | 0 | 0 | 0 | 4 | 6 | 31 | 677 | 0 | 0 | 0 | 3 | 0 |
4 | CN-1262969-A | CN | 2000 | 290 | 288 | B | Catalyst using TiO2 as carrier to load metal nitride Mo2N,TiO | UNIV NANKAI | 0 | 7 | 283 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 290 | 0 |
5 | CN-1470327-A | CN | 2002 | 286 | 286 | C | Metal nitride catalyst preparing method and catalyst,一种金属氮化物催化剂制备方法及催化剂 | CHINA PETROCHEMICAL CORP | 0 | 7 | 279 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 286 | 0 |
1 | KR-20000051826-A | KR | 1999 | 192 | 179 | C | New organomattalic complex molecule for the fabrication of organic light emitting diodes,신규한 착물 및 그의 제조 방법과 이를 이용한 유기 발광 소자 | LG CHEMICAL LTD | 0 | 0 | 162 | 0 | 0 | 0 | 0 | 30 | 12 | 0 | 0 | 168 | 0 | 12 |
2 | KR-20110003229-A | KR | 2009 | 379 | 175 | A | 하이브리드 수술용 로봇 시스템 및 수술용 로봇 제어방법,Hybrid surgical robot system and control method thereof | ETERNE INC | 360 | 6 | 0 | 0 | 0 | 1 | 7 | 5 | 366 | 0 | 1 | 4 | 1 | 7 |
3 | KR-20000074034-A | KR | 1999 | 202 | 145 | H | Ultra-slim Repeater with Variable Attenuator,케이블 손실 보상이 가능한 초소형 중계기 | ACE TECH | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 192 | 197 | 0 | 0 | 4 | 0 | 1 |
4 | KR-20120009843-A | KR | 2010 | 168 | 142 | G | 이동단말기 및 그의 어플리케이션 공유 방법,Mobile terminal and method for sharing applications thereof | LG ELECTRONICS INC | 0 | 0 | 0 | 0 | 0 | 0 | 17 | 151 | 155 | 0 | 0 | 3 | 2 | 8 |
5 | KR-20190000980-A | KR | 2017 | 148 | 141 | H | A personal Healthcare System,퍼스널 헬스케어 시스템 | LEE DONG WON | 15 | 38 | 0 | 1 | 15 | 24 | 19 | 36 | 1 | 0 | 0 | 147 | 0 | 0 |
1 | WO-2004114391-A1 | WO | 2004 | 2859 | 1667 | H | Dispositif a semi-conducteur et son procede de production, et dispositif electronique,Semiconductor device, its manufacturing method, and electronic device,半導体装置およびその製造方法ならびに電子デバイス | KAWASAKI MASASHI,SHARP KK,SUGIHARA TOSHINORI,OHNO HIDEO | 1 | 3 | 10 | 1 | 0 | 2 | 666 | 2176 | 2843 | 7 | 4 | 1 | 1 | 3 |
2 | WO-9201047-A1 | WO | 1991 | 2421 | 1413 | C | Methods for producing members of specific binding pairs,Procede de production de chainon de paires a liaison specifique | CAMBRIDGE ANTIBODY TECH,MEDICAL RES COUNCIL | 797 | 2 | 1410 | 0 | 0 | 0 | 212 | 0 | 850 | 2 | 780 | 1 | 4 | 784 |
3 | WO-9633735-A1 | WO | 1996 | 1764 | 1023 | C | Anticorps humains derives d'une xenosouris immunisee,Human antibodies derived from immunized xenomice | CELL GENESYS INC | 645 | 2 | 1003 | 0 | 0 | 0 | 114 | 0 | 616 | 29 | 510 | 2 | 1 | 606 |
4 | WO-9634096-A1 | WO | 1995 | 1712 | 1022 | C | Anticorps humains derives de xeno-souris immunisees,Human antibodies derived from immunized xenomice | CELL GENESYS INC | 626 | 2 | 977 | 0 | 0 | 0 | 107 | 0 | 553 | 17 | 523 | 2 | 0 | 617 |
5 | WO-9307278-A1 | WO | 1992 | 1137 | 995 | A | Sequence d'adn synthetique ayant une action insecticide accrue dans le mais,Synthetic dna sequence having enhanced insecticidal activity in maize | CIBA GEIGY AG | 515 | 1 | 616 | 0 | 0 | 0 | 5 | 0 | 81 | 0 | 230 | 0 | 1 | 825 |
4. Citations across countries
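For example, the pattern where a US publication is used in a JP examination.
This time I targeted applications at six offices: the so-called five major patent offices (JP, US, CN, EP, KR) plus international applications (WO). So I extracted "applications that are also used outside their home country", here meaning documents cited from all six offices.
Normally an examiner cites documents written in the local language, but when no such document describes similar technology, a foreign document is sometimes brought in.
By country the result is almost entirely US, and at a glance the electrical and telecommunications fields (G, H) seem to dominate. My impression is that there are many Japanese companies. Perhaps these documents are convenient to use as technical references?
Code
SELECT
pubnum, ipcs, total_cit_count, unique_cit_count, applicants, titles, US, JP, EP, CN, KR, WO
FROM `bqml_tutorial.fcitdata`
WHERE US>0 AND JP>0 AND EP>0 AND CN>0 AND KR>0 AND WO>0
#GROUP BY ipcs,pubnum,total_cit_count,unique_cit_count,titles,applicants
ORDER BY total_cit_count DESC
LIMIT 10
List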
pubnum | ipcs | total_cit_count | unique_cit_count | applicants | titles | US | JP | EP | CN | KR | WO |
---|---|---|---|---|---|---|---|---|---|---|---|
US-5892900-A | G | 4035 | 1944 | INTERTRUST TECH CORP | Systems and methods for secure transaction management and electronic rights protection | 3955 | 2 | 19 | 29 | 4 | 26 |
US-2003189401-A1 | C | 3849 | 1958 | INT MFG & ENG SERVICES CO LTD | Organic electroluminescent device | 3820 | 1 | 16 | 1 | 3 | 8 |
US-2007194379-A1 | G | 3592 | 1833 | JAPAN SCIENCE & TECH AGENCY | Amorphous Oxide And Thin Film Transistor | 3574 | 1 | 7 | 1 | 2 | 7 |
US-2007108446-A1 | H | 3501 | 1758 | SEMICONDUCTOR ENERGY LAB | Semiconductor device and manufacturing method thereof | 3464 | 4 | 11 | 7 | 13 | 2 |
US-2006244107-A1 | H | 3493 | 1844 | SUGIHARA TOSHINORI,KAWASAKI MASASHI,OHNO HIDEO | Semiconductor device, manufacturing method, and electronic device | 3461 | 2 | 12 | 7 | 3 | 8 |
US-2006108636-A1 | H | 3475 | 1777 | TOKYO INST TECH | Amorphous oxide and field effect transistor | 3452 | 6 | 10 | 3 | 1 | 3 |
US-2006110867-A1 | H | 3454 | 1752 | TOKYO INST TECH | Field effect transistor manufacturing method | 3426 | 4 | 9 | 5 | 4 | 6 |
US-2006113565-A1 | H | 3439 | 1763 | TOKYO INST TECH | Electric elements and circuits utilizing amorphous oxides | 3412 | 4 | 3 | 10 | 2 | 8 |
US-2008038882-A1 | H | 3401 | 1734 | TAKECHI KAZUSHIGE,NAKATA MITSURU | Thin-film device and method of fabricating the same | 3342 | 13 | 8 | 24 | 4 | 10 |
US-2007090365-A1 | H | 3398 | 1742 | CANON KK | Field-effect transistor including transparent oxide and light-shielding member, and display utilizing the transistor | 3384 | 3 | 1 | 8 | 1 | 1 |
This kind of relationship might be easier to understand if visualized as a network diagram.
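As a first step toward such a network view, the per-office counts can be turned into a weighted edge list (citing office to cited document's country). The sketch below only builds the edges; drawing them is left to whatever graph library is at hand. The two sample rows are copied from the list above, and the function name is made up for illustration.

```python
from collections import Counter

OFFICES = ("US", "JP", "EP", "CN", "KR", "WO")

# Two rows copied from the list above (citing-office counts per cited document).
rows = [
    {"pubnum": "US-5892900-A", "US": 3955, "JP": 2, "EP": 19, "CN": 29, "KR": 4, "WO": 26},
    {"pubnum": "US-2003189401-A1", "US": 3820, "JP": 1, "EP": 16, "CN": 1, "KR": 3, "WO": 8},
]


def office_edges(rows):
    """Aggregate weighted edges (citing office -> country of the cited document)."""
    edges = Counter()
    for row in rows:
        cited_country = row["pubnum"][:2]
        for office in OFFICES:
            if row[office] > 0:
                edges[(office, cited_country)] += row[office]
    return edges


edges = office_edges(rows)
print(edges[("EP", "US")])  # 35
```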
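5. Correcting for time
Someone will probably object that recent documents are at a disadvantage because little time has passed since publication, and older documents do in fact tend to have more citations. So I adjusted for that a little by dividing the total citation count by the number of years since filing.
A serious treatment would have to deal with truncation bias and so on, but none of that is handled here.
US applications filed in 2008 to 2009 ended up dominating the top of the list.
Code
# kotoshi = the current year
DECLARE kotoshi INT64;
SET kotoshi = CAST(FORMAT_DATE("%Y",CURRENT_DATE()) AS INT64);
SELECT
pubnum, appcountry, appyear, total_cit_count, unique_cit_count,
kotoshi - CAST(appyear AS INT64) AS pub_interval,
total_cit_count / (kotoshi - CAST(appyear AS INT64)) AS ave_cit,
ipcs, titles
FROM `bqml_tutorial.fcitdata`
ORDER BY ave_cit DESC
LIMIT 20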
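The correction in the query above is a plain division of the total citation count by the years elapsed since filing. The small sketch below reproduces it in Python and adds a guard for documents filed in the current year, where the interval would be zero; the guard is an addition for illustration and is not part of the SQL. The year 2020 is used because a `pub_interval` of 11 for a 2009 filing implies the query was run in 2020.

```python
def citations_per_year(total_cit_count, appyear, current_year):
    """Average citations per elapsed year; None when no full year has elapsed."""
    interval = current_year - int(appyear)
    if interval <= 0:
        return None
    return total_cit_count / interval


# Values for US-2009278122-A1 from the result list: 3164 citations, filed in 2009.
print(round(citations_per_year(3164, "2009", 2020), 1))  # 287.6
```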
List
pubnum | appcountry | appyear | total_cit_count | unique_cit_count | pub_interval | ave_cit | ipcs | titles |
---|---|---|---|---|---|---|---|---|
US-2009278122-A1 | US | 2009 | 3164 | 1725 | 11 | 287.6 | H | Amorphous oxide and thin film transistor |
US-2009280600-A1 | US | 2009 | 3163 | 1727 | 11 | 287.5 | G | Amorphous oxide and thin film transistor |
US-2010092800-A1 | US | 2009 | 3150 | 1678 | 11 | 286.4 | B | Substrate for growing wurtzite type crystal and method for manufacturing the same and semiconductor device |
US-2010065844-A1 | US | 2009 | 3149 | 1677 | 11 | 286.3 | H | Thin film transistor and method of manufacturing thin film transistor |
US-7732819-B2 | US | 2008 | 3341 | 1737 | 12 | 278.4 | H | Semiconductor device and manufacturing method thereof |
US-2009134399-A1 | US | 2008 | 3303 | 1708 | 12 | 275.3 | C | Semiconductor Device and Method for Manufacturing the Same |
US-2009073325-A1 | US | 2008 | 3298 | 1705 | 12 | 274.8 | G | Semiconductor device and method for manufacturing the same, and electric device |
US-2008224133-A1 | US | 2008 | 3285 | 1714 | 12 | 273.8 | H | Thin film transistor and organic light-emitting display device having the thin film transistor |
US-2008258141-A1 | US | 2008 | 3279 | 1720 | 12 | 273.3 | H | Thin film transistor, method of manufacturing the same, and flat panel display having the same |
US-2009068773-A1 | US | 2008 | 3275 | 1709 | 12 | 272.9 | H | Method for fabricating pixel structure of active matrix organic light-emitting diode |
6. Reference: most-cited non-patent literature
The cited documents include non-patent literature, so I extracted that as well (no name normalization).
Compared with patent documents, the entries are recorded rather sloppily, though...
Setting aside the odd entries, the most-cited item is the following paper on searching gene and protein sequences (BLAST): ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 410
Code
SELECT
cit.npl_text AS title,
COUNT(cit.npl_text) AS count,
STRING_AGG(DISTINCT(SUBSTR(main.publication_number,0,2))) AS cit_country,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'A' THEN 1 ELSE 0 END) as IPC_A,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'B' THEN 1 ELSE 0 END) as IPC_B,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'C' THEN 1 ELSE 0 END) as IPC_C,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'D' THEN 1 ELSE 0 END) as IPC_D,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'E' THEN 1 ELSE 0 END) as IPC_E,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'F' THEN 1 ELSE 0 END) as IPC_F,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'G' THEN 1 ELSE 0 END) as IPC_G,
SUM(CASE SUBSTR(ipcs.code,0,1) WHEN 'H' THEN 1 ELSE 0 END) as IPC_H,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'US' THEN 1 ELSE 0 END) as US,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'JP' THEN 1 ELSE 0 END) as JP,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'EP' THEN 1 ELSE 0 END) as EP,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'KR' THEN 1 ELSE 0 END) as KR,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'CN' THEN 1 ELSE 0 END) as CN,
SUM(CASE SUBSTR(main.publication_number,0,2) WHEN 'WO' THEN 1 ELSE 0 END) as WO
FROM
`patents-public-data.patents.publications_201912` as main,
UNNEST(main.citation) AS cit,
UNNEST(main.ipc) AS ipcs
WHERE cit.npl_text NOT IN ('','None','NEANT','NICHTS ERMITTELT','NICHTS-ERMITTELT','No Search','International Search Report.','No further relevant documents disclosed') AND ipcs.first = TRUE
GROUP BY cit.npl_text
ORDER BY count DESC
List
title | count | cit_country | IPC_A | IPC_B | IPC_C | IPC_D | IPC_E | IPC_F | IPC_G | IPC_H | US | JP | EP | KR | CN | WO |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
PATENT ABSTRACTS OF JAPAN vol. 2003, no. 12 5 December 2003 (2003-12-05) | 3694 | FR,NL,EP,LU,CH,GR,WO | 317 | 805 | 407 | 60 | 59 | 461 | 819 | 766 | 0 | 0 | 1922 | 0 | 0 | 1354 |
PATENT ABSTRACTS OF JAPAN | 3117 | WO,FR,US,DE,NL,EP | 367 | 659 | 505 | 17 | 73 | 260 | 703 | 533 | 1 | 0 | 1258 | 0 | 0 | 1717 |
ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 410 | 2385 | WO,EP | 516 | 2 | 1756 | 1 | 0 | 0 | 109 | 1 | 0 | 0 | 964 | 0 | 0 | 1421 |
Kraft et al., "Linkage disequilibrium and fingerprinting in sugar beet," Theor Appl Genet, 101:323-326, 2000. | 2256 | US | 2193 | 0 | 63 | 0 | 0 | 0 | 0 | 0 | 2256 | 0 | 0 | 0 | 0 | 0 |
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS | 2086 | WO,EP | 528 | 14 | 1389 | 0 | 0 | 1 | 153 | 1 | 0 | 0 | 987 | 0 | 0 | 1099 |
JONES ET AL., NATURE, vol. 321, 1986, pages 522 - 525 | 1969 | EP,WO | 584 | 6 | 1221 | 0 | 0 | 0 | 158 | 0 | 0 | 0 | 881 | 0 | 0 | 1088 |
WARD ET AL., NATURE, vol. 341, 1989, pages 544 - 546 | 1908 | EP,WO | 504 | 3 | 1233 | 0 | 0 | 0 | 168 | 0 | 0 | 0 | 816 | 0 | 0 | 1092 |
BIRD ET AL., SCIENCE, vol. 242, 1988, pages 423 - 426 | 1900 | EP,WO | 555 | 4 | 1199 | 0 | 0 | 0 | 142 | 0 | 0 | 0 | 739 | 0 | 0 | 1161 |
CLACKSON ET AL., NATURE, vol. 352, 1991, pages 624 - 628 | 1784 | WO,EP | 536 | 9 | 1035 | 0 | 0 | 0 | 204 | 0 | 0 | 0 | 728 | 0 | 0 | 1056 |
Meghji et al., "Inbreeding depression, inbred and hybrid grain yields, and other traits of maize genotypes representing three eras," Crop Science, 24:545-549, 1984. | 1754 | US | 1698 | 0 | 56 | 0 | 0 | 0 | 0 | 0 | 1754 | 0 | 0 | 0 | 0 | 0 |
7. Excluding US applicant citations
US applications carry an awful lot of applicant citations, so I modified the WHERE clause with the aim of excluding them.
# Only the part that differs
WHERE SUBSTR(main.publication_number,0,2) IN ('US','JP','EP','CN','KR','WO') AND pubnum IS NOT NULL AND ipcs.first = TRUE AND cit.category IN ('EXA','SEA','ISR')
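The effect of the added `cit.category` condition can be pictured with a small Python filter. The three kept values are the ones from the WHERE clause above; the sample records, including the "APP" label used here for an applicant citation, are assumptions for illustration.

```python
# Same category values as in the WHERE clause above.
EXAMINER_CATEGORIES = frozenset({"EXA", "SEA", "ISR"})


def keep_examiner_citations(citations):
    """Keep only citations whose category is one of the examiner/search-report values."""
    return [c for c in citations if c.get("category") in EXAMINER_CATEGORIES]


# Hypothetical citation records for one citing publication.
citations = [
    {"cited": "US-0000001-A", "category": "SEA"},
    {"cited": "US-0000002-A", "category": "APP"},  # assumed label for an applicant citation
    {"cited": "US-0000003-A", "category": "ISR"},
    {"cited": "US-0000004-A", "category": ""},     # missing category is dropped as well
]
kept = keep_examiner_citations(citations)
print([c["cited"] for c in kept])  # ['US-0000001-A', 'US-0000003-A']
```

Note that a filter like this also drops citations with an empty category, so it excludes more than applicant citations alone.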
This gives the following ranking.