Pythonic way to extract values from this text file
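I have an output file from a piece of legacy software, shown below. I want to extract values from it so that, for example, I can set a variable called direct_solar_irradiance to 648.957 and the target ground pressure to 1013.00.
So far I have been pulling out individual lines by number and processing them as shown below (repeated many times, once for each value I want to extract):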
values = lines[97].split()
self.irradiance_direct, self.irradiance_diffuse, self.irradiance_env = values
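However, I have now found that when certain parameters are selected, extra lines are added in the middle of the output. That means, of course, that line 97 no longer holds the values I need.
Given that extra lines may be added to the output in some cases, is there a good Pythonic way to extract these values? I suppose I need to search the file for known pieces of text and then pull out the numbers they refer to, but the only ways I can think of to do that are very clunky.
So:
> Is there a nice Pythonic way to search for these strings and extract the values I want?
> If not, is there some other sensible way to do this (for example, some cool text-file parsing library that I know nothing about)?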
******************************* 6sV version 1.0B ******************************
* *
* geometrical conditions identity *
* ------------------------------- *
* user defined conditions *
* *
* month: 14 day : 1 *
* solar zenith angle: 10.00 deg solar azimuthal angle: 20.00 deg *
* view zenith angle: 30.00 deg view azimuthal angle: 40.00 deg *
* scattering angle: 159.14 deg azimuthal angle difference: 20.00 deg *
* *
* atmospheric model description *
* ----------------------------- *
* atmospheric model identity : *
* midlatitude summer (uh2o=2.93g/cm2,uo3=.319cm-atm) *
* aerosols type identity : *
* Maritime aerosol model *
* optical condition identity : *
* visibility : 8.49 km opt. thick. 550 nm : 0.5000 *
* *
* spectral condition *
* ------------------ *
* monochromatic calculation at wl 0.400 micron *
* *
* Surface polarization parameters *
* ---------------------------------- *
* *
* *
* Surface Polarization Q,U,Rop,Chi 0.00000 0.00000 0.00000 0.00 *
* *
* *
* target type *
* ----------- *
* homogeneous ground *
* monochromatic reflectance 1.000 *
* *
* target elevation description *
* ---------------------------- *
* ground pressure [mb] 1013.00 *
* ground altitude [km] 0.000 *
* *
* plane simulation description *
* ---------------------------- *
* plane pressure [mb] 1013.00 *
* plane altitude absolute [km] 0.000 *
* atmosphere under plane description: *
* ozone content 0.000 *
* h2o content 0.000 *
* aerosol opt. thick. 550nm 0.000 *
* *
* atmospheric correction activated *
* -------------------------------- *
* BRDF coupling correction *
* input apparent reflectance : 0.500 *
* *
*******************************************************************************
*******************************************************************************
* *
* integrated values of : *
* -------------------- *
* *
* apparent reflectance 1.1287696 appar. rad.(w/m2/sr/mic) 588.646 *
* total gaseous transmittance 1.000 *
* *
*******************************************************************************
* *
* coupling aerosol -wv : *
* -------------------- *
* wv above aerosol : 1.129 wv mixed with aerosol : 1.129 *
* wv under aerosol : 1.129 *
*******************************************************************************
* *
* integrated values of : *
* -------------------- *
* *
* app. polarized refl. 0.0000 app. pol. rad. (w/m2/sr/mic) 0.000 *
* direction of the plane of polarization 0.00 *
* total polarization ratio 0.000 *
* *
*******************************************************************************
* *
* int. normalized values of : *
* --------------------------- *
* % of irradiance at ground level *
* % of direct irr. % of diffuse irr. % of enviro. irr *
* 0.351 0.354 0.295 *
* reflectance at satellite level *
* atm. intrin. ref. background ref. pixel reflectance *
* 0.000 0.000 1.129 *
* *
* int. absolute values of *
* ----------------------- *
* irr. at ground level (w/m2/mic) *
* direct solar irr. atm. diffuse irr. environment irr *
* 648.957 655.412 544.918 *
* rad at satel. level (w/m2/sr/mic) *
* atm. intrin. rad. background rad. pixel radiance *
* 0.000 0.000 588.646 *
* *
* *
* sol. spect (in w/m2/mic) *
* 1663.594 *
* *
*******************************************************************************
*******************************************************************************
* *
* integrated values of : *
* -------------------- *
* *
* downward upward total *
* global gas. trans. : 1.00000 1.00000 1.00000 *
* water " " : 1.00000 1.00000 1.00000 *
* ozone " " : 1.00000 1.00000 1.00000 *
* co2 " " : 1.00000 1.00000 1.00000 *
* oxyg " " : 1.00000 1.00000 1.00000 *
* no2 " " : 1.00000 1.00000 1.00000 *
* ch4 " " : 1.00000 1.00000 1.00000 *
* co " " : 1.00000 1.00000 1.00000 *
* *
* *
* rayl. sca. trans. : 0.84422 1.00000 0.84422 *
* aeros. sca. " : 0.94572 1.00000 0.94572 *
* total sca. " : 0.79616 1.00000 0.79616 *
* *
* *
* *
* rayleigh aerosols total *
* *
* spherical albedo : 0.23410 0.12354 0.29466 *
* optical depth total: 0.36193 0.55006 0.91199 *
* optical depth plane: 0.00000 0.00000 0.00000 *
* reflectance I : 0.00000 0.00000 0.00000 *
* reflectance Q : 0.00000 0.00000 0.00000 *
* reflectance U : 0.00000 0.00000 0.00000 *
* polarized reflect. : 0.00000 0.00000 0.00000 *
* degree of polar. : nan 0.00 nan *
* dir. plane polar. : -45.00 -45.00 -45.00 *
* phase function I : 1.38819 0.27621 0.71751 *
* phase function Q : -0.09117 -0.00856 -0.04134 *
* phase function U : -1.34383 0.02142 -0.52039 *
* primary deg. of pol: -0.06567 -0.03099 -0.05762 *
* sing. scat. albedo : 1.00000 0.98774 0.99261 *
* *
* *
*******************************************************************************
*******************************************************************************
*******************************************************************************
* atmospheric correction result *
* ----------------------------- *
* input apparent reflectance : 0.500 *
* measured radiance [w/m2/sr/mic] : 260.747 *
* atmospherically corrected reflectance *
* Lambertian case : 0.52995 *
* BRDF case : 0.52995 *
* coefficients xa xb xc : 0.00241 0.00000 0.29466 *
* y=xa*(measured radiance)-xb; acr=y/(1.+xc*y) *
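Solution:
You can roll your own mini-language to automate the extraction.
I did the following to automate the parsing of the output of a proprietary program: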
# will match in the order written here
tokens = ["num_ref_frames", "Max QP", "Min QP", "Avg QP", "I4x4",
          "I16x16", "SkipZero", "SkipMV", "16x16", "16x8", "8x16",
          "8x8", "8x4", "4x8", "4x4"]
special = ["Quarterpel MVs"]
# this dictionary (hash table) maps each search string from the tokens list
# to a list whose first element is the index of the field to extract when
# building the matrix array, e.g. 0 = 1st field, 1 = 2nd field, 2 = 3rd field etc.
dict = {tokens[0]: [1], tokens[1]: [1], tokens[2]: [1], tokens[3]: [1],
        tokens[4]: [2], tokens[5]: [2], tokens[6]: [2], tokens[7]: [2],
        tokens[8]: [2], tokens[9]: [2], tokens[10]: [2], tokens[11]: [2],
        tokens[12]: [2], tokens[13]: [2], tokens[14]: [2]}
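Then I simply loop over the input and check each line for the tokens; if a token matches, I split the line and pick out the right field according to the dict entry.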
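As a rough illustration of that loop applied to the 6S output from the question, here is a minimal sketch. It is not the code from the gist: the `FIELDS` table, the `extract` function, and the convention that the index counts whitespace-separated fields after the token are all assumptions made for this example.

```python
# Hypothetical token table for the 6S output shown in the question:
# search string -> index of the whitespace-separated field that follows it.
FIELDS = {
    "ground pressure": 1,              # "ground pressure [mb] 1013.00" -> "1013.00"
    "ground altitude": 1,              # "ground altitude [km] 0.000"   -> "0.000"
    "total gaseous transmittance": 0,  # the value follows the token directly
}


def extract(lines, fields=FIELDS):
    """Return {token: float}, using the first line that contains each token."""
    found = {}
    for line in lines:
        for token, index in fields.items():
            if token in found or token not in line:
                continue
            # Keep only the text after the token, then split it into fields.
            rest = line.partition(token)[2].split()
            found[token] = float(rest[index])
    return found


# A few lines copied from the output above (spacing simplified).
sample = [
    "* ground pressure [mb] 1013.00 *",
    "* ground altitude [km] 0.000 *",
    "* total gaseous transmittance 1.000 *",
]
values = extract(sample)
print(values["ground pressure"])
```

Because each value is located by its label rather than by line number, extra lines appearing elsewhere in the output should not affect the result, as long as the labels themselves stay the same.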
The special list above holds a variable that needs special handling because it has to be read from multiple lines.
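The 6S output in the question has the same multi-line shape in places: a line of column headers followed by a line of numbers. One way this could be handled is sketched below. The helper `values_after` is hypothetical and is not taken from the gist; it assumes the numbers always sit on the line immediately after the header.

```python
def values_after(lines, header, count):
    """Return `count` floats from the line that follows the first line
    containing `header`. Assumes the values sit directly below the header."""
    iterator = iter(lines)
    for line in iterator:
        if header in line:
            following = next(iterator, "")
            # Drop the decorative "*" border and surrounding whitespace.
            fields = following.strip("* \n").split()
            return [float(field) for field in fields[:count]]
    raise ValueError(f"header not found: {header!r}")


# Lines copied from the output above (spacing simplified).
sample = [
    "* irr. at ground level (w/m2/mic) *",
    "* direct solar irr. atm. diffuse irr. environment irr *",
    "* 648.957 655.412 544.918 *",
]
direct, diffuse, env = values_after(sample, "direct solar irr.", 3)
print(direct, diffuse, env)
```

The header string needs to be specific enough to match only one line; "direct solar irr." is used here because the shorter "direct irr." appears in the percentage table earlier in the output.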
Update
Clone git://gist.github.com/1037403.git to get a copy of the code.
usage:
./parser.py all_dec.txt
Hope this helps!
Tags: python, idioms, text Source: https://codeday.me/bug/20190530/1186934.html