Commit 9cf01b5

PatyHidalgo authored and staadecker committed
Merge pull request #121 from staadecker/plots
Paper figures, improved graphing code and other improvements
2 parents bb3f4f6 + 41c970c commit 9cf01b5

45 files changed: +4514 -455 lines changed

docs/Pandas.md

Lines changed: 37 additions & 2 deletions
@@ -108,6 +108,39 @@ where the columns over which we are merging are `key_1` and `key_2`.
 
 - `Series.unique()`: Returns a series where duplicate values are dropped.
 
+## Note on reading SWITCH files
+
+When reading SWITCH CSV files, it is recommended to use the following arguments in `pd.read_csv()`.
+
+- `index_col=False`: This forces Pandas to not automatically use the
+  first column as an index, which ensures you are not using custom indexes
+  (see the notes on custom indexes above).
+
+- `dtype={"GENERATION_PROJECT": str}`: If all the generation project IDs happen to be
+  numbers, then Pandas will automatically set the `GENERATION_PROJECT` column type
+  to `int`. However, we don't want this since it may cause issues when dealing with
+  multiple dataframes, some of which have non-numeric IDs. (E.g. if you try merging
+  a DataFrame where `GENERATION_PROJECT` is an `int` with another where it's a `str`, it
+  won't work properly.)
+
+- `dtype=str`: An even safer option than `dtype={"GENERATION_PROJECT": str}` is `dtype=str`.
+  This is particularly important when reading a file that will then be re-output with minimal changes.
+  Without this option, there's the risk of floating point values being slightly
+  modified (see [here](https://github.com/pandas-dev/pandas/issues/16452)) or integer columns
+  containing NA values (`.`) being ["promoted"](https://pandas.pydata.org/pandas-docs/stable/user_guide/gotchas.html?highlight=nan#na-type-promotions)
+  to floats. Note that with `dtype=str`, all columns are strings, so to do mathematical
+  computation on a column it will first need to be converted with `.astype()`.
+
+- `na_values="."`: SWITCH uses full stops to indicate an unspecified value. We want Pandas
+  to interpret full stops as `NaN` rather than the string `.` so that the column type is
+  still properly interpreted rather than being detected as a string.
+
+Combining these parameters, here is an example of how to read a SWITCH file.
+
+```
+df = pd.read_csv("some_SWITCH_file.csv", index_col=False, dtype={"GENERATION_PROJECT": str}, na_values=".")
+```
+
 ## Example
 
 This example shows how we can use Pandas to generate a more useful view
@@ -117,9 +150,11 @@ of our generation plants from the SWITCH input files.
 import pandas as pd
 
 # READ
+# See note above on why we use these parameters
 kwargs = dict(
     index_col=False,
-    dtype={"GENERATION_PROJECT": str},  # This ensures that the project id column is read as a string not an int
+    dtype={"GENERATION_PROJECT": str},
+    na_values=".",
 )
 gen_projects = pd.read_csv("generation_projects_info.csv", **kwargs)
 costs = pd.read_csv("gen_build_costs.csv", **kwargs)
@@ -138,7 +173,7 @@ gen_projects = gen_projects.merge(
 )
 
 # FILTER
-# When uncommented will filter out all the projects that aren't wind.
+# When uncommented, this line will filter out all the projects that aren't wind.
 # gen_projects = gen_projects[gen_projects["gen_energy_source"] == "Wind"]
 
 # WRITE
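
As a rough illustration of the `read_csv` arguments recommended in the docs/Pandas.md note, the sketch below parses a small in-memory CSV three ways. The column `gen_capacity_limit_mw` and the sample values are made up for illustration, and `pd.to_numeric` is shown only as one possible way to convert a string column (the note itself mentions `.astype()`).

```python
import io

import pandas as pd

# Hypothetical stand-in for a SWITCH input file: numeric-looking project IDs
# and a "." marking an unspecified value.
csv_text = (
    "GENERATION_PROJECT,gen_capacity_limit_mw\n"
    "1001,50\n"
    "1002,.\n"
)

# Default parsing: the IDs are inferred as integers, and the "." makes the
# capacity column non-numeric (it is read as text).
default_df = pd.read_csv(io.StringIO(csv_text), index_col=False)

# Recommended parsing: the IDs stay strings and "." becomes NaN, so the
# capacity column is numeric (float, since NaN cannot be stored as an int).
switch_df = pd.read_csv(
    io.StringIO(csv_text),
    index_col=False,
    dtype={"GENERATION_PROJECT": str},
    na_values=".",
)

# dtype=str keeps every value exactly as written, which suits files that
# will be written back out with minimal changes.
raw_df = pd.read_csv(io.StringIO(csv_text), index_col=False, dtype=str)

# A string column must be converted before doing arithmetic on it.
# errors="coerce" turns the "." placeholder into NaN.
capacity = pd.to_numeric(raw_df["gen_capacity_limit_mw"], errors="coerce")
```

Comparing `default_df.dtypes`, `switch_df.dtypes` and `raw_df.dtypes` shows how each option changes the inferred column types.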
Lines changed: 184 additions & 0 deletions
@@ -0,0 +1,184 @@
+# %%
+
+# Imports
+import pandas as pd
+from matplotlib.ticker import PercentFormatter
+
+from papers.Martin_Staadecker_Value_of_LDES_and_Factors.LDES_paper_graphs.util import (
+    get_scenario,
+    set_style,
+)
+from switch_model.tools.graph.main import GraphTools
+
+set_style()
+
+# Prepare graph tools
+tools = GraphTools(
+    scenarios=[
+        get_scenario("1342", name=1.94),
+        get_scenario("M7", name=2),
+        get_scenario("M6", name=4),
+        get_scenario("M5", name=8),
+        get_scenario("M4", name=16),
+        get_scenario("M3", name=32),
+        get_scenario("M2", name=64),
+    ]
+)
+tools.pre_graphing(multi_scenario=True)
+
+# Specify formatting and get figure
+fig = tools.get_figure(size=(12, 12))
+ax1 = fig.add_subplot(2, 2, 1)
+ax2 = fig.add_subplot(2, 2, 2)
+ax3 = fig.add_subplot(2, 2, 3)
+ax4 = fig.add_subplot(2, 2, 4)
+
+# %%
+
+ax = ax1
+ax.clear()
+ax.tick_params(top=False, bottom=False, right=False, left=False, which="major")
+
+df = tools.get_dataframe(
+    "load_balance.csv",
+    usecols=[
+        "timestamp",
+        "normalized_energy_balance_duals_dollar_per_mwh",
+        "scenario_name",
+    ],
+).rename(columns={"normalized_energy_balance_duals_dollar_per_mwh": "value"})
+# df = df[df["scenario_name"] != "1.94"]
+df = tools.transform.timestamp(df)
+df = df.groupby(["scenario_name", "hour"], as_index=False)["value"].mean()
+df = df.pivot(index="hour", columns="scenario_name", values="value")
+df = df.rename_axis("Storage Capacity (TWh)", axis=1)
+df.loc[24] = df.loc[0]
+df *= 0.1  # Convert from $/MWh to cents/kWh
+df.plot(
+    ax=ax,
+    colormap="viridis",
+    xlabel="Time of Day (PST)",
+    marker=".",
+    ylabel="Normalized Duals (\xa2/kWh)",
+)
+ax.set_xlim(0, 24)
+ax.set_ylim(0, df.max().max() * 1.05)
+ax.set_title("A. Mean Energy Balance Duals by Time of Day")
+ax.set_xticks([0, 4, 8, 12, 16, 20, 24])
+# %%
+ax = ax2
+ax.clear()
+ax.tick_params(top=False, bottom=False, right=False, left=False, which="both")
+
+df = tools.get_dataframe(
+    "load_balance.csv",
+    usecols=[
+        "timestamp",
+        "normalized_energy_balance_duals_dollar_per_mwh",
+        "scenario_name",
+    ],
+).rename(columns={"normalized_energy_balance_duals_dollar_per_mwh": "value"})
+# df = df[df["scenario_name"] != "1.94"]
+df = df.groupby(["scenario_name", "timestamp"], as_index=False).mean()
+df = tools.transform.timestamp(df)
+df = df.set_index("datetime")
+df = (
+    df.groupby("scenario_name", as_index=False)
+    .rolling("7D", center=True)["value"]
+    .mean()
+)
+df = df.unstack("scenario_name").rename_axis("Storage Capacity (TWh)", axis=1)
+# Convert from $/MWh to cents/kWh
+df *= 0.1
+df.plot(
+    ax=ax,
+    colormap="viridis",
+    xlabel="Month of Year",
+    ylabel="Normalized Duals (\xa2/kWh)",
+)
+ax.set_title("B. Mean Energy Balance Duals Throughout the Year")
+# %%
+
+ax = ax3
+ax.clear()
+ax.tick_params(top=False, bottom=False, right=False, left=False, which="both")
+
+# Calculate transmission
+tx = tools.get_dataframe(
+    "transmission.csv",
+    usecols=["BuildTx", "trans_length_km", "scenario_name"],
+    convert_dot_to_na=True,
+).fillna(0)
+tx["BuildTx"] *= tx["trans_length_km"]
+tx = tx.groupby("scenario_name")["BuildTx"].sum().rename("Transmission")
+
+# Get new buildout
+buildout = tools.get_dataframe("BuildGen.csv").rename(
+    columns={"GEN_BLD_YRS_1": "GENERATION_PROJECT"}
+)
+# Keep only latest year
+buildout = buildout[buildout["GEN_BLD_YRS_2"] == 2050]
+# Merge with projects to get gen_type
+projects = tools.get_dataframe(
+    "generation_projects_info.csv",
+    from_inputs=True,
+    usecols=["GENERATION_PROJECT", "gen_tech", "gen_energy_source", "scenario_name"],
+)
+buildout = buildout.merge(
+    projects,
+    on=["GENERATION_PROJECT", "scenario_name"],
+    validate="one_to_one",
+    how="left",
+)
+del projects
+buildout = tools.transform.gen_type(buildout)
+# Filter out storage since it's not considered generation
+buildout = buildout[buildout["gen_type"] != "Storage"]
+# Sum across the entire scenario
+buildout = buildout.groupby("scenario_name")["BuildGen"].sum().rename("Generation")
+
+# Merge into same dataframe
+df = pd.concat([tx, buildout], axis=1)
+
+# Convert to percent against baseline
+df = (df / df.iloc[0] - 1) * 100
+
+# Plot
+df.plot(ax=ax, marker=".")
+ax.set_ylabel("Change in Capacity Built Compared to Baseline")
+ax.yaxis.set_major_formatter(PercentFormatter())
+ax.set_xlabel("WECC-wide Storage Capacity (TWh)")
+ax.set_title("C. Impact of Storage on Transmission & Generation Investments")
+ax.set_ylim(-100, 0)
+# %%
+
+# Read dispatch.csv
+ax = ax4
+ax.clear()
+df = tools.get_dataframe(
+    "dispatch.csv",
+    usecols=[
+        "gen_tech",
+        "gen_energy_source",
+        "Curtailment_MW",
+        "is_renewable",
+        "tp_weight_in_year_hrs",
+        "scenario_name",
+    ],
+    na_filter=False,  # For performance
+)
+# Keep only renewable
+df = df[df["is_renewable"]]
+# Add the gen_type column
+df = tools.transform.gen_type(df)
+# Convert MW x hours (MWh) to GWh
+df["value"] = df["Curtailment_MW"] * df["tp_weight_in_year_hrs"] / 1000
+df = df.groupby(["scenario_name", "gen_type"], as_index=False).value.sum()
+df = df.pivot(index="scenario_name", columns="gen_type", values="value")
+df /= 1000
+df = df.rename_axis("Technology", axis=1)
+df.plot(ax=ax, color=tools.get_colors(), marker=".")
+ax.set_ylabel("Yearly Curtailment (GWh)")
+ax.set_xlabel("WECC-wide Storage Capacity (TWh)")
+ax.set_title("D. Impact of Storage on Curtailment")
+ax.tick_params(top=False, bottom=False, right=False, left=False)
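
The "percent against baseline" step used for panel C of the script above can be tried on toy data. The sketch below is a minimal illustration with made-up totals, not values from the paper. It assumes, as the script does, that the first row is the baseline scenario.

```python
import pandas as pd

# Made-up totals indexed by storage capacity (TWh); the first row is the baseline.
totals = pd.DataFrame(
    {
        "Transmission": [200.0, 180.0, 150.0],
        "Generation": [400.0, 380.0, 300.0],
    },
    index=pd.Index([1.94, 2.0, 4.0], name="scenario_name"),
)

# Dividing by the baseline row aligns on column names, so every row is scaled
# column by column. Subtracting 1 and multiplying by 100 gives the percent change.
pct_change = (totals / totals.iloc[0] - 1) * 100
```

The baseline row comes out as zero in every column, and a negative value means less capacity was built than in the baseline scenario.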
